Tips for developing a grouping algorithm that categorizes items within a dataset based on the presence of at least three shared attributes

I am building a tool for a client that organizes keywords into groups based on overlap in their top 10 Google search URLs. Each keyword is represented as a JavaScript object containing a list of URLs. Two keywords belong in the same group if they share 3 or more URLs. No keyword should appear more than once across the generated groups, and the number of groups is not known in advance. Any insights on refining the logic behind this problem would be greatly appreciated!

While I have crafted an algorithm below to tackle this issue, it still results in duplicates and fails to accurately group certain keywords together.

function makeKeywordGroupsNew(results: Result[], uid: string): Group[] {
  let dataset = results;
  let groups: any[] = [];

  // iterating through all records in the dataset
  dataset.forEach((current: Result) => {
    // initializing the group with the current keyword
    const group = { volume: 0, items: [current] };
    // removing current keyword from dataset
    dataset = dataset.filter(el => el.keyword !== current.keyword);
    // comparing current keyword with others to determine shared URLs
    dataset.forEach((other: Result) => {
      const urlsInCommon = _.intersection(current.urls, other.urls);
      if (urlsInCommon.length >= 3) {
        group.items.push(other);
      }
    });

    // calculating group volume - extraneous to core logic
    group.volume = _.sum(group.items.map(item => item.volume));
    // sorting keywords by volume - extraneous to core logic
    group.items = group.items
      .sort((a, b) => {
        if (a.volume < b.volume) return 1;
        if (a.volume > b.volume) return -1;
        return 0;
      })
      .map(el => el.keyword);
    
    // adding newly formed group to result array
    groups.push(group);
  });

  // filtering out single keyword groups
  groups = groups.filter(group => group.items.length > 1);
  // removing duplicate keywords in each group
  groups = groups.map(group => ({ ...group, items: _.uniq(group.items) }));
  
  return groups.map(group => ({
    uid,
    main: group.items[0],
    keywords: group.items.slice(1),
    volume: group.volume
  }));
}

I was anticipating the output from input.json to align with output.csv, but my solution either undergroups or misclassifies keywords.

Answer №1

One likely issue is the way you filter the dataset array: it is reassigned inside the callback of its own forEach loop, which makes the logic hard to follow and may be causing the wrong results. Instead, leave the source array unchanged, build the list of remaining keywords in a separate variable, and iterate over that list.
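A minimal sketch of that idea, assuming each record has the `keyword`, `urls` and `volume` fields used in the question. The helper names `sharedUrlCount` and `groupKeywords` are hypothetical, and the sketch swaps the dataset filtering for an `assigned` set so that every keyword lands in at most one group. Like the original code, it only compares candidates against the first keyword of a group, not against every member:

```typescript
interface Result {
  keyword: string;
  urls: string[];
  volume: number;
}

// Number of distinct URLs that appear in both lists.
function sharedUrlCount(a: string[], b: string[]): number {
  const inA = new Set(a);
  return Array.from(new Set(b)).filter(url => inA.has(url)).length;
}

// Greedy grouping: each keyword joins at most one group.
// `minShared` is the threshold from the question (3 common URLs).
function groupKeywords(results: Result[], minShared = 3): string[][] {
  const assigned = new Set<string>();
  const groups: string[][] = [];

  results.forEach((current, i) => {
    // Skip keywords that were already placed in an earlier group.
    if (assigned.has(current.keyword)) return;

    const group = [current.keyword];
    assigned.add(current.keyword);

    // Look only at later, still unassigned keywords; `results` is never mutated.
    for (let j = i + 1; j < results.length; j++) {
      const other = results[j];
      if (assigned.has(other.keyword)) continue;
      if (sharedUrlCount(current.urls, other.urls) >= minShared) {
        group.push(other.keyword);
        assigned.add(other.keyword);
      }
    }

    // Drop single-keyword groups, as the question's code does.
    if (group.length > 1) groups.push(group);
  });

  return groups;
}
```

Because the grouping is greedy, the result depends on the order of the input: a keyword that matches two different seeds ends up with whichever seed comes first.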

Answer №2

Looking at the output, something is clearly off: there are 135 "groups" instead of the expected 87, while the original code generates 88.

The problem might lie at the beginning of your code:

  let dataset = results;
  let groups: any[] = [];

  dataset.forEach((current: Result) => {
    const group = { volume: 0, items: [current] };
    dataset = dataset.filter(el => el.keyword !== current.keyword);

You're modifying dataset within the dataset.forEach loop.

I believe this should be:

  //let dataset = results; remove this
  let groups: any[] = [];

  results.forEach((current: Result) => {
    const group = { volume: 0, items: [current] };
    const dataset = results.filter(el => el.keyword !== current.keyword);
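For reference, here is a small standalone check of what the reassignment in the original loop does. The sample array is made up for illustration. `forEach` keeps iterating the array it was called on, so reassigning the variable inside the callback changes what the inner comparison sees but not which elements the outer loop visits:

```typescript
// Hypothetical sample data standing in for the keyword records.
let dataset: string[] = ["a", "b", "c"];
const visited: string[] = [];

dataset.forEach(current => {
  visited.push(current);
  // `filter` returns a new array; the array being iterated is left untouched.
  dataset = dataset.filter(el => el !== current);
});

// visited is ["a", "b", "c"]: every element was still visited.
// dataset is []: the variable now points at the last filtered copy.
console.log(visited, dataset);
```

This suggests one source of the duplicates: a keyword that was already pushed into an earlier group is still visited later and starts a group of its own.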

Answer №3

To streamline this, you can condense the logic into a single reduce call, which avoids the side effects of reassigning the dataset mid-loop:

const result = data.reduce((accumulator, value) => {
  // find the first existing group that shares at least 3 URLs with this keyword
  const matchingGroup = accumulator.findIndex(
    element => _.intersection(value.urls, element.urls).length >= 3
  );

  if (matchingGroup !== -1) {
    // add the keyword to the matching group unless it is already listed
    if (!accumulator[matchingGroup].searches.includes(value.keyword)) {
      accumulator[matchingGroup].searches.push(value.keyword);
    }
  } else {
    // no match: start a new group with this keyword as the main one
    accumulator.push({
      mainKeyword: value.keyword,
      searches: [],
      urls: value.urls
    });
  }

  return accumulator;
}, []);
console.log(result.map(item => _.pick(item, ['mainKeyword', 'searches'])));

This implementation leaves out the volume counting and sorting, but those can easily be integrated into the existing code.
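One way the counting and sorting could be layered on top is sketched below. It assumes each group holds the full records (with `keyword` and `volume`) rather than only keyword strings, and that the output shape matches the `uid`, `main`, `keywords`, `volume` object returned in the question. The `Group` interface and the `summarizeGroup` name are hypothetical:

```typescript
interface Result {
  keyword: string;
  urls: string[];
  volume: number;
}

// Assumed output shape, mirroring the object returned in the question.
interface Group {
  uid: string;
  main: string;
  keywords: string[];
  volume: number;
}

// Turns the members of one group into the final record.
// Expects a non-empty `items` array.
function summarizeGroup(items: Result[], uid: string): Group {
  // Copy before sorting so the caller's array is not reordered.
  const sorted = [...items].sort((a, b) => b.volume - a.volume);
  // Total search volume of the group.
  const volume = sorted.reduce((sum, item) => sum + item.volume, 0);
  // The highest-volume keyword becomes the main one.
  const [main, ...keywords] = sorted.map(item => item.keyword);
  return { uid, main, keywords, volume };
}
```

Keeping this step separate from the grouping pass means the grouping logic can be tested on its own, without volumes.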
