I'm trying to extract website URLs from text, and this is the regex I have so far:
((http|https):\/\/)?(www\.)[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)
Issue:
If I make www. optional, the regex also matches strings like vue.js or something.ts, and it even treats email addresses as web URLs. If I make www. mandatory, it fails to detect URLs like this.
The current regex works well for my requirements; I only need it to be flexible enough to capture the URLs mentioned above as well.
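To make the problem reproducible, here is a minimal sketch in Python (my choice for illustration; the pattern itself is not tied to it). The names OPTIONAL_WWW, MANDATORY_WWW and find_urls are made up for this example, and https://example.com is only a stand-in for a URL that has a scheme but no www. prefix.

```python
import re

# The pattern from above with "www." made optional (note the added "?").
OPTIONAL_WWW = re.compile(
    r"((http|https):\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9@:%_\+.~#?&//=]*)"
)

# The pattern from above unchanged: "www." is mandatory.
MANDATORY_WWW = re.compile(
    r"((http|https):\/\/)?(www\.)[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9@:%_\+.~#?&//=]*)"
)


def find_urls(pattern, text):
    """Return the full text of every match of `pattern` in `text`."""
    return [m.group(0) for m in pattern.finditer(text)]
```

With these definitions, find_urls(OPTIONAL_WWW, "built with vue.js, mail joe@example.com") returns both vue.js and joe@example.com, while find_urls(MANDATORY_WWW, "https://example.com") returns an empty list.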
Query:
I want to check whether the capture group containing http or https actually matched. If the URL starts with http:// or https://, www. should be optional; otherwise, it should be mandatory.
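To show the behaviour I'm after, here is a rough sketch using Python's conditional group syntax, (?(1)yes|no), which tests whether group 1 took part in the match. This is only an illustration: CONDITIONAL_URL and extract_urls are hypothetical names, I simplified (http|https) to https?, and as far as I know not every regex flavor supports conditionals (JavaScript, for one, does not).

```python
import re

# Group 1 is the optional scheme. The conditional (?(1)...|...) makes
# "www." optional when group 1 matched and mandatory when it did not.
CONDITIONAL_URL = re.compile(
    r"(https?:\/\/)?"
    r"(?(1)(?:www\.)?|www\.)"
    r"[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b"
    r"[-a-zA-Z0-9@:%_\+.~#?&//=]*"
)


def extract_urls(text):
    """Return every substring of `text` that the conditional pattern matches."""
    return [m.group(0) for m in CONDITIONAL_URL.finditer(text)]
```

For the input "Docs at https://example.com/docs and www.example.org, built with vue.js, contact joe@example.com." this picks up https://example.com/docs and www.example.org and skips vue.js and the email address. It is not a full URL validator, and I'm unsure whether there is a more portable way to express the same condition.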
Any thoughts on how to solve this?