How to use URL regex validation in JavaScript

Given a string of characters, we want to determine whether it is a valid URL. Doing this in plain JavaScript would take many lines of code along with many conditional statements. Thankfully, we can get the job done with a little something called Regular Expressions. Regular Expressions, or Regex for short, are sequences of characters that define a search pattern.

Summary

This is the Regex code snippet that is used to validate a URL: /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/. At first glance, this may look overwhelming, so let's break down the code into smaller chunks to better understand what is happening.
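Before breaking the pattern down, it may help to see it in use. The following is a minimal sketch that assumes we only need a true/false answer; the variable names and sample strings are made up for illustration.

```javascript
// The URL pattern from the summary, written as a regex literal.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;

// test() returns true when the whole string fits the pattern.
const looksValid = urlPattern.test("https://www.example.com/docs");
const looksInvalid = urlPattern.test("not a url");

console.log(looksValid, looksInvalid);
```

The second string fails because the pattern requires at least one literal period between the domain and the top-level domain.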

Regex Components

The first thing you may notice about our snippet is that it is wrapped in (/), front and back. This way of creating a Regex is known as literal notation.
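As a small sketch of what literal notation means, we can compare it with the RegExp constructor, which builds the same kind of object from a string. The variable names here are assumptions for the example.

```javascript
// Literal notation: the pattern sits between two forward slashes.
const literal = /https?/;

// Constructor notation: the same pattern passed in as a string.
const constructed = new RegExp("https?");

// Both objects hold the same pattern source.
console.log(literal.source === constructed.source);
```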

Anchors

In our snippet, the characters ^ and $ are both anchors. Anchors match a position rather than a character. The ^ anchor matches the beginning of the text, in our case right before (https?:\/\/)?, and the $ anchor matches the end of the text, right after ([\/\w \.-]*)*\/?, which we will come to later. So far, our broken-down Regex looks like this: /^$/. The rest of our Regex will sit between the ^ and the $.
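To see what the anchors change, here is a short sketch comparing the same pattern with and without them. The sample text is an assumption chosen for the example.

```javascript
// Without anchors, the pattern may match anywhere inside the text.
const unanchored = /https/;
// With ^ and $, the pattern must span the text from start to end.
const anchored = /^https$/;

const sample = "see https here";

const foundInside = unanchored.test(sample);   // true
const wholeTextMatch = anchored.test(sample);  // false
const exactMatch = anchored.test("https");     // true

console.log(foundInside, wholeTextMatch, exactMatch);
```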

Quantifiers

Quantifiers set how many times the preceding character or group may repeat in a match, and they are as follows:

  • * - 0 or more
  • + - 1 or more
  • ? - 0 or 1
  • {n} - Exact Amount
  • {min,max} - Range of Amounts (written without a space)

In the first () of our Regex breakdown, we want to validate that a string's URL protocol begins with https:// or http://. To accomplish this, we can use the question mark quantifier after the (s), https?, to validate that there is either no (s) in our pattern or just one.
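The quantifiers above can be tried out one at a time. This is a minimal sketch with made-up sample strings; the anchors are added so that each test has to cover the whole string.

```javascript
// ? makes the "s" optional: zero or one occurrence.
const optionalS = /^https?$/;
// {2,6} accepts between 2 and 6 lower case letters.
const twoToSix = /^[a-z]{2,6}$/;

const results = {
  http: optionalS.test("http"),        // true, no "s"
  https: optionalS.test("https"),      // true, one "s"
  httpss: optionalS.test("httpss"),    // false, two "s"
  com: twoToSix.test("com"),           // true, 3 letters
  single: twoToSix.test("c"),          // false, fewer than 2
  tooLong: twoToSix.test("toolong"),   // false, 7 letters
};

console.log(results);
```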

Grouping Constructs

We haven't discussed this yet, but as you may have guessed, we can group patterns together to further break down our URL string. The way we do this is with parentheses (), so let's put our code snippet into groups in the same way we break down a URL string: ((https://)(www.somename.co.us.)(com)(/stuff)). This gives our snippet four groups: protocol, domain, top-level domain, and path. Note: Just as the (?) after the (s) made the (s) optional, the same can be said about the (?) outside of the grouped pattern (https?://)?. The whole pattern in parentheses is now optional.
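Groups also let us read back the part of the string that each one matched. The sketch below uses a simplified two-group pattern, not the full URL snippet, and the sample strings are assumptions. The forward slashes are written as \/ because the pattern is a regex literal.

```javascript
// Group 1: optional protocol. Group 2: everything that follows.
const grouped = /^(https?:\/\/)?(.+)$/;

// match() returns an array: index 0 is the full match, then one entry per group.
const withProtocol = "https://example.com".match(grouped);
const withoutProtocol = "example.com".match(grouped);

console.log(withProtocol[1], withProtocol[2]);
// An optional group that did not take part in the match is undefined.
console.log(withoutProtocol[1], withoutProtocol[2]);
```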

Bracket Expressions

Similar to how we can group our pattern in parentheses, we can also use bracket expressions to specify a set or range of characters that we want to match. For example, [a-zA-Z] would match all upper and lower case letters. Let's add some bracket expressions to our code snippet: /^(https?://)?([a-z.-]+).([a-z.]{2,6})([/.-]*)*/?$/. Let's briefly discuss what each group is doing while thinking about our URL breakdown:

  • Protocol group: (https?://)? validates that the pattern starts with an optional https:// or http://.
  • Domain group: ([a-z.-]+). uses a bracket expression to match lower case letters a-z, a period, and a hyphen, while the (+) quantifier validates that one or more of the specified characters are present. The period after the group validates that the group is followed by a literal (.).
  • Top-level domain group: ([a-z.]{2,6}) uses a bracket expression that matches letters a-z and a (.), along with a min/max quantifier that validates the pattern is at least 2 characters but not more than 6.
  • Path group: ([/.-]*)*/? uses a bracket expression that matches a slash, a period, or a hyphen, 0 or more times. The optional forward slash and the dollar sign ($) anchor validate that the pattern ends there, with or without a trailing slash.
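Bracket expressions can be checked in isolation as well. This is a small sketch with assumed sample strings; note that inside brackets a period and a trailing hyphen are treated as plain characters.

```javascript
// Any upper or lower case letter, one or more times.
const lettersOnly = /^[a-zA-Z]+$/;
// Lower case letters, a period, or a hyphen, one or more times.
const domainChars = /^[a-z.-]+$/;

const checks = {
  letters: lettersOnly.test("Hello"),          // true
  lettersDigit: lettersOnly.test("Hello1"),    // false, "1" is not a letter
  domain: domainChars.test("my-site.co"),      // true
  domainUpper: domainChars.test("My_Site"),    // false, "M" and "_" are not in the set
};

console.log(checks);
```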

Character Classes

With our breakdown snippet starting to look similar to our URL validation snippet, let's discuss another useful tool in Regex called Character Classes. A Character Class defines a set of characters, any one of which can occur in an input string to fulfill a match. We had some exposure to character classes when we discussed bracket expressions; now let's look at some common Character Classes:

  • . - matches any character except the newline character (\n).
  • \d - matches a digit (equal to [0-9])
  • \w - matches any word character (equal to [a-zA-Z0-9_])
  • \s - matches a single whitespace character, including tabs and line breaks

Note: Each of the last three character classes can be changed to perform an inverse match by capitalizing the letter character. For example, \D matches a non-digit character.
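Here is a short sketch of these character classes, including one inverse class. The sample strings are assumptions picked for the example.

```javascript
const onlyDigits = /^\d+$/;   // digits from start to end
const noDigits = /^\D+$/;     // inverse of \d: no digits at all
const wordChars = /^\w+$/;    // letters, digits, and underscore
const hasWhitespace = /\s/;   // at least one whitespace character

const classChecks = {
  digits: onlyDigits.test("2024"),          // true
  nonDigits: noDigits.test("2024"),         // false
  word: wordChars.test("user_name1"),       // true
  tab: hasWhitespace.test("a\tb"),          // true, a tab counts as whitespace
  none: hasWhitespace.test("no-spaces"),    // false
};

console.log(classChecks);
```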

Now we update our snippet breakdown with character classes: /^(https?://)?([\da-z.-]+).([a-z.]{2,6})([/\w .-]*)*/?$/

Character Escapes

By now you have probably noticed some things that look contradictory in our breakdown snippet, like the (/) or the (.). How do we know whether we want to match any character or just a literal (.)? In Regex, we can escape characters with a back-slash (\), so if we want to match a literal (.), we just put the back-slash before it (\.). Now that we are familiar with escaping special characters, let's update our snippet to escape any special characters and conclude our URL validation code snippet: /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/.
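The difference an escape makes can be sketched with a tiny pattern, followed by one possible way to wrap the finished snippet in a function. The helper name isValidUrl and the sample inputs are hypothetical, not part of the original snippet.

```javascript
// Unescaped: the dot is a character class that matches any character.
const anyCharacter = /^a.c$/;
// Escaped: the back-slash turns the dot into a literal period.
const literalDot = /^a\.c$/;

console.log(anyCharacter.test("abc")); // true, "b" counts as any character
console.log(literalDot.test("abc"));   // false, "b" is not a period
console.log(literalDot.test("a.c"));   // true

// Hypothetical helper that wraps the fully escaped URL pattern.
function isValidUrl(candidate) {
  const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;
  // Guard against non-string input before testing.
  return typeof candidate === "string" && urlPattern.test(candidate);
}

console.log(isValidUrl("example.co.uk/path/page.html"));
console.log(isValidUrl("https://"));
```

Because the protocol group is optional, the first call passes without http:// or https://, while the second fails since nothing follows the protocol.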

Author

Document was created by Demetri Dillard, a Jr. Developer and graduate of Trilogy Schools (University of Minnesota).