Regular Expressions are Awesome

Ever since I began programming in Ruby and was exposed to regular expressions, I’ve loved using them for string matching. They are incredibly flexible and let you handle edge cases without ugly conditionals or imperative code inside loops.

For instance, say we’re looking to parse out hours and minutes from a string that looks like 2:54 pm+1, 12:02 am, or 12 am. If it weren’t for the last example, we could do something like the following:

function getHours(timeString) {
  return parseInt(timeString.split(":")[0]);
}

function getMinutes(timeString) {
  return parseInt(timeString.split(":")[1].slice(0, 2));
}
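To see how the split-based approach behaves on each of the three sample inputs, here is a minimal sketch. The names getMinutesWithSplit and tryGetMinutes are hypothetical, introduced only for this demonstration; the first is a copy of getMinutes above, and the second reports the name of any error thrown instead of crashing.

```javascript
// Hypothetical copy of the split-based getMinutes, renamed for this demo.
function getMinutesWithSplit(timeString) {
  return parseInt(timeString.split(":")[1].slice(0, 2), 10);
}

// Hypothetical helper: returns the parsed minutes, or the error's name if parsing throws.
function tryGetMinutes(timeString) {
  try {
    return getMinutesWithSplit(timeString);
  } catch (error) {
    return error.name;
  }
}

console.log(tryGetMinutes("2:54 pm+1")); // 54
console.log(tryGetMinutes("12:02 am")); // 2
console.log(tryGetMinutes("12 am")); // "TypeError"
```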

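But since we may not have minutes or a : symbol, getMinutes doesn’t work for the last example: split(":") returns a single-element array, so index 1 is undefined and calling slice on it throws a TypeError. If we continue without regular expressions, we have to handle this edge case by hand. It might look something like: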

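function getHours(timeString) {
  let hours = 0;
  for (let i = 0; i < timeString.length; i++) {
    // Stop at the end of the hours: either the colon or the space before am/pm.
    if (timeString[i] === ":" || timeString[i] === " ") break;
    // Shift the digits seen so far one decimal place left, then append the new digit.
    hours = hours * 10 + parseInt(timeString[i], 10);
  }
  return hours;
}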

function getMinutes(timeString) {
  let minutes = 0;
  let colonSeen = false;
  for (let i = 0; i < timeString.length; i++) {
    if (timeString[i] === ":") {
      colonSeen = true;
      continue;
    }
    if (!colonSeen) continue;
    if (timeString[i] === " ") break;
    // Shift the digits seen so far one decimal place left, then append the new digit.
    minutes = minutes * 10 + parseInt(timeString[i], 10);
  }
  return minutes;
}

This isn’t necessarily the easiest way to do it. We could also check whether there’s a : before iterating:

function getMinutes(timeString) {
  const colonLocation = timeString.indexOf(":");
  if (colonLocation === -1) return 0;
  let minutes = 0;

  for (let i = colonLocation + 1; i < timeString.length; i++) {
    if (timeString[i] === " ") break;
    // Shift the digits seen so far one decimal place left, then append the new digit.
    minutes = minutes * 10 + parseInt(timeString[i], 10);
  }

  return minutes;
}

This is a bit simpler, and we could also use recursion to simplify the logic. The point, though, is that parsing out sections of a string by hand is tedious and low-level. Instead, let’s approach this with regular expressions.

const HOUR_MATCHER = /^\d+/;
function getHours(timeString) {
  const match = timeString.match(HOUR_MATCHER);
  return match ? parseInt(match[0]) : 0;
}

const MINUTE_MATCHER = /(?<=:)\d+/;
function getMinutes(timeString) {
  const match = timeString.match(MINUTE_MATCHER);
  return match ? parseInt(match[0]) : 0;
}

This is so much cleaner and it handles the case where there are no minutes at all.

Regular expression literals are written between slashes, like so: /<regular-expression-here>/. We can then call the match method on a string to match a regular expression against it. If there's no match, it returns null. Otherwise (for a pattern without the g flag, as here), it returns an array whose element at index 0 is the full matched text, along with a few extra properties such as index, the position where the match starts.
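A quick way to see what match gives back is to inspect the result directly. This is a small sketch using one of the sample strings from earlier; the variable name matchResult is just for illustration.

```javascript
// Match the first run of digits in a sample time string (no g flag).
const matchResult = "2:54 pm+1".match(/\d+/);

console.log(Array.isArray(matchResult)); // true
console.log(matchResult[0]); // "2" (the full matched text)
console.log(matchResult.index); // 0 (where the match starts in the string)

// When nothing matches, match returns null rather than an empty array.
console.log("pm".match(/\d+/)); // null
```

Because a failed match is null, indexing into the result without checking it first throws, which is why getHours and getMinutes guard with a ternary.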

In getHours, we're using the ^ character to anchor the regular expression to the beginning of the string. Then we're matching \d, which matches a digit, and using the + character to match one or more of them. If there is no match, we return 0 hours, although it would probably be better to throw an error, because the timeString is in an unexpected format.
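If we wanted to throw on unexpected input instead of returning 0, it might look something like the sketch below. The names getHoursStrict and LEADING_DIGITS, and the error message, are hypothetical choices for this example.

```javascript
// Same pattern as the hour matcher: one or more digits at the start of the string.
const LEADING_DIGITS = /^\d+/;

// Hypothetical stricter variant of getHours that rejects strings without leading digits.
function getHoursStrict(timeString) {
  const match = timeString.match(LEADING_DIGITS);
  if (!match) {
    throw new Error(`Unexpected time format: "${timeString}"`);
  }
  return parseInt(match[0], 10);
}

console.log(getHoursStrict("12 am")); // 12
```

Whether to throw or fall back to a default depends on where the strings come from; throwing makes bad input visible early rather than silently treating it as midnight.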

In getMinutes, we're using a positive lookbehind, (?<=:), which asserts that the text immediately before the current position matches the given pattern without including it in the match. In this case, the digits only match if they come right after a : symbol. Then, like before, we're matching one or more digits with \d+. Matching stops at the first non-digit character (such as a space) or at the end of the string. If the lookbehind never succeeds (there's no colon followed by a digit), match returns null and we return 0 minutes.
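Lookbehind assertions were added to JavaScript in ES2018, so older engines may not support them. An alternative that avoids lookbehind is to use capture groups and read both values from a single match. This is only a sketch: TIME_MATCHER and parseTime are hypothetical names, and returning null for unparseable input is an assumption, not something the functions above do.

```javascript
// Hypothetical pattern: leading hours, then an optional ":" followed by minutes.
// (?: ... ) groups without capturing; the trailing ? makes the minutes part optional.
const TIME_MATCHER = /^(\d+)(?::(\d+))?/;

// Hypothetical helper: returns { hours, minutes }, or null if the string has no leading digits.
function parseTime(timeString) {
  const match = timeString.match(TIME_MATCHER);
  if (!match) return null;
  return {
    hours: parseInt(match[1], 10),
    // match[2] is undefined when the optional minutes group did not participate.
    minutes: match[2] === undefined ? 0 : parseInt(match[2], 10),
  };
}

console.log(parseTime("2:54 pm+1")); // { hours: 2, minutes: 54 }
console.log(parseTime("12 am")); // { hours: 12, minutes: 0 }
```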

Regular expressions might look a bit cryptic at first, but once you get the basics down, they're actually quite simple. If you're learning to use regular expressions, a regular expression tester is helpful: it lets you try a pattern against various cases as you build it to make sure it works correctly.
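The same idea can be approximated in code with a small table of inputs and expected values. The sketch below repeats the two regex-based functions so it runs on its own; CASES and findFailures are hypothetical names, and the expected values are simply the three sample strings from the start of this post.

```javascript
const HOUR_MATCHER = /^\d+/;
const MINUTE_MATCHER = /(?<=:)\d+/;

function getHours(timeString) {
  const match = timeString.match(HOUR_MATCHER);
  return match ? parseInt(match[0], 10) : 0;
}

function getMinutes(timeString) {
  const match = timeString.match(MINUTE_MATCHER);
  return match ? parseInt(match[0], 10) : 0;
}

// Hypothetical table of cases: [input, expected hours, expected minutes].
const CASES = [
  ["2:54 pm+1", 2, 54],
  ["12:02 am", 12, 2],
  ["12 am", 12, 0],
];

// Returns the inputs whose parsed values differ from the expected ones.
function findFailures(cases) {
  return cases
    .filter(([input, hours, minutes]) =>
      getHours(input) !== hours || getMinutes(input) !== minutes)
    .map(([input]) => input);
}

console.log(findFailures(CASES)); // []
```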

I’ve always found regular expressions to be much more concise, flexible, and elegant than manual parsing algorithms. If you haven’t used regular expressions, you should definitely check them out and add them to your toolbox. They’re a tool I’ve used countless times, and they can help simplify your code.