Regex struggles to identify words containing foreign characters

Here is a method I have created to check if a user-input term matches any blacklisted terms:

  static checkAgainstBlacklist(blacklistTerms, term) {
    return blacklistTerms.some(word =>
      (new RegExp(`\\b${word}\\b`, 'i')).test(term)
    );
  }

I'm encountering an issue with words containing special characters:

  it('should return true if sentence contains blacklisted term',
    inject([BlacklistService], () => {
      const blacklistTerms = [
        'scat',
        'spic',
        'forbanna',
        'olla',
        'satan',
        'götverenlerden',
        '你它马的',
        '幼児性愛者',
      ];
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'scat')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'scat-website')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'spic')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'website-spic')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'forbanna')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'olla')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'satan-website')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'götverenlerden')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, '你它马的')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, '幼児性愛者')).toEqual(true);
    })
  );

While most tests pass, these three are failing:

      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, 'götverenlerden')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, '你它马的')).toEqual(true);
      expect(BlacklistService.checkAgainstBlacklist(blacklistTerms, '幼児性愛者')).toEqual(true);

What adjustments can I make to my regex command to handle these terms correctly?

Answer №1

\b is designed to match word boundaries, but the issue lies in the fact that the characters at the beginning and end of the failing strings (apart from the last n in götverenlerden) are not considered word characters, causing the pattern to fail.

To rectify this, one can modify the regex by either matching the start/end of the string or utilizing separator characters such as spaces and punctuation:

static verifyAgainstBlacklist(blacklistedTerms, term) {
  return blacklistedTerms.some(word =>
    (new RegExp(
      String.raw`(?:^|(?!\w)[\u0000-\u007f])${word}(?:$|(?!\w)[\u0000-\u007f])`,
      'im'
    )).test(term)
  );
}

The utilized character codes can be viewed here. Essentially, (?!\w)[\u0000-\u007f] matches any character between character code 0 and 255 that does not fall within the range of 0-9, A-Z, a-z, or _.

This generates a pattern as follows:

(?:^|(?!\w)[\u0000-\u007f])götverenlerden(?:$|(?!\w)[\u0000-\u007f])

Further details can be found at https://regex101.com/r/AHRVgA/1

Another approach involves splitting the input string by separator characters (such as punctuation, spaces, etc., consistent with the aforementioned pattern) and then verifying if any resulting words are present in the blacklistedTerms.

However, regardless of the algorithm used for your automated filter, users often find inventive ways to bypass these restrictions while conveying their message.

Similar questions

If you have not found the answer to your question or you are interested in this topic, then look at other similar questions below or use the search

Save the entire compiler output as a text or CSV file by using the "strict":true compiler option in TypeScript

The tsconfig.json file in my Visual Studio project includes the following settings: { "CompileOnSave":false, "CompilerOptions":{ "strict": true, "skipLibCheck":true }, "angularCompilerOptions":{ "fullT ...

Module 'serviceAccountKey.json' could not be located

I'm encountering an issue while trying to incorporate Firebase Functions into my project. The problem lies in importing the service account key from my project. Here is a snippet of my code: import * as admin from 'firebase-admin'; var ser ...

Update the data and paginator status

In my development project, I have implemented PrimeNG Turbotable with the code <p-table #dt. Based on information from here, using dt.reset() will clear the sort, filter, and paginator settings. However, I am looking for a solution to only reset the pa ...

Navigating horizontally to find a particular element

I developed a unique Angular component resembling a tree structure. The design includes multiple branches and nodes in alternating colors, with the selected node marked by a blue dot. https://i.stack.imgur.com/fChWu.png Key features to note: The tree&ap ...

The presence of a constructor in a component disrupts the connection between React and Redux in

I am facing an issue with the connect function from 'react-redux' in my Typescript class example. The error occurs at the last line and I'm struggling to understand why it's happening. The constructor is necessary for other parts of the ...

Setting the TypeScript version while initializing CDK

When creating a new CDK app in typescript, I typically use the following command: npx --yes <a href="/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="d9babdb299e8f7e8eae1f7eb">[email protected]</a> init app --language typesc ...

Can you explain the significance of tslint's message: "Warning: The 'no-use-before-declare' rule necessitates type information"?

Can someone explain the significance of tslint's "no-use-before-declare" rule warning, which states that it requires type information? I've tried researching online but still don't fully understand its implications. ...

What is the best way to update the color of a label in a Mantine component?

When using the Mantine library, defining a checkbox is done like this: <Checkbox value="react" label="React"/> While it's possible to change the color of the checkbox itself, figuring out how to change the color of the label ...

What could be causing the "no exported member" errors to appear when trying to update Angular?

The dilemma I'm facing a challenge while attempting to upgrade from Angular V9 to V11. Here are the errors that I am encountering: Namespace node_module/@angular/core/core has no exported member ɵɵFactoryDeclaration Namespace node_module/@angular/ ...

Highlighting the home page in the navigation menu even when on a subroute such as blog/post in the next.js framework

After creating a navigation component in Next JS and framer-motion to emphasize the current page, I encountered an issue. The problem arises when navigating to a sub route like 'localhost:3000/blog/post', where the home tab remains highlighted i ...

The HTML file that was typically generated by Webpack is now missing from the output

While working on my nodejs app using typescript, react, and webpack, everything was running smoothly. I was getting the expected output - an HTML file along with the bundle file. However, out of nowhere and without making any changes to my code, I noticed ...

Exploring the power of Angular 10 components

Angular version 10 has left me feeling bewildered. Let's explore a few scenarios: Scenario 1: If I create AComponent.ts/html/css without an A.module.ts, should I declare and export it in app.module.ts? Can another module 'B' use the 'A ...

Efficiently Measuring the Visibility Area of DetailsList in Fluent UI

I am currently utilizing the DetalisList component alongside the ScrollablePane to ensure that the header remains fixed while allowing the rows to scroll. However, I have encountered a challenge as I need to manually specify the height of the scrollable co ...

I encountered TS2345 error: The argument type X cannot be assigned to the parameter type Y

Currently, I am delving into the world of Angular 8 as a beginner with this framework. In my attempt to design a new user interface with additional elements, I encountered an unexpected linting error after smoothly adding the first two fields. The error m ...

Learn how to capture complete stack traces for errors when using Google Cloud Functions

In the codebase I am currently working on, I came across a backend service that I would like to utilize for logging all errors along with their corresponding Http statuses. If possible, I also want to retrieve the full stack trace of these errors from this ...

Angular 5 does not recognize the function submitEl.triggerEventHandler, resulting in an error

Greetings! I am currently working on writing unit test cases in Angular5 Jasmine related to submitting a form. Below is the structure of my form: <form *ngIf="formResetToggle" class="form-horizontal" name="scopesEditorForm" #f="ngForm" novalidate (ngSu ...

Angular2 - adding the authentication token to request headers

Within my Angular 2 application, I am faced with the task of authenticating every request by including a token in the header. A service has been set up to handle the creation of request headers and insertion of the token. The dilemma arises from the fact t ...

Enhancing Code Functionality with TypeScript Overload Methods

I've encountered an issue with a code snippet that has a method with 2 overloads: /** * Returns all keys of object that have specific value: * @example * KeysOfType<{a:1, b:2, c:1}, 1> == 'a' | 'c' */ type KeysOfType<M ...

How to effectively handle null values using try..catch statement in typescript

As a beginner, I am learning how to write a try/catch statement in TypeScript. My issue is that there is a function within the "try" block that returns null. How can I implement code in the "catch" block specifically for when the function in "try" returns ...

What is the process for attaching an analytics tag to data messages using the Firebase Admin SDK with Javascript or TypeScript?

Adding a label to my message is something I'm trying to do. I checked out the official guidelines here and found a similar question answered on Stack Overflow here. I've been attempting to implement this in JavaScript, but I'm stuck. Here& ...