Lesson Tuesday

In the last lesson, we wrote a very basic function for counting words in a paragraph. It doesn't work very well. It doesn't care about the difference between numbers and words in a string. It returns 1 even when we pass in an empty string. So let's keep working on this function and make it better. We're ready to actually start building our application.

Create a directory called text-analyzer with two files: a README.md and a directory called js that includes a scripts.js file. Over the next several lessons, you'll follow along with building the text-analyzer application.

If you're wondering why there is no index.html file or css directory, it's because we aren't ready for them yet. Test-driven development is all about building your business logic first. It doesn't necessarily mean that all business logic has to be created before a user interface, as an application is constantly changing. However, if we start mingling business logic and user interface right now, we'll probably run into problems. We'll likely have issues with separation of logic. We might also end up with bugs in our code that are hard to track. For instance, if something isn't working, we might not be able to tell whether the error is in our business logic or user interface logic because everything is mixed up and we haven't tested it.

Here's our scripts.js file so far based on the code we covered in the last lesson:

scripts.js
// Business Logic

function wordCounter(text) {
  let wordCount = 0;
  const wordArray = text.split(" ");
  wordArray.forEach(function(word) {
    wordCount++;
  });
  return wordCount;
}

A quick reminder: as we stated in the last lesson, we are going back to using a loop because we need the practice. We've also added the comment // Business Logic to make it easier to see how our business and UI logic are separated because they are in the same file. We're just using this comment for clarity and organization - it's not something you'll see in real-world code bases. In Intermediate JavaScript, we'll learn to separate our code into different files and we won't need these comments anymore.
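To see why this version returns 1 for an empty string, it can help to look at what String.prototype.split() hands back. The snippet below is only a quick console sketch; the variable names emptySplit and twoWords are made up for illustration.

```javascript
// Splitting an empty string on a space still produces an array with
// one element: the empty string itself.
const emptySplit = "".split(" ");
console.log(emptySplit.length); // 1

// Two words separated by a single space produce two elements.
const twoWords = "hello there".split(" ");
console.log(twoWords.length); // 2
```

Because wordCounter() increments wordCount once per element, that single empty element is counted as a word.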

Here's what the tests in our README.md should look like so far:

Describe: wordCounter()

Test: "It should return 1 if a passage has just one word."
Code:
const text = "hello";
wordCounter(text);
Expected Output: 1

Test: "It should return 2 if a passage has two words."
Code:
const text = "hello there";
wordCounter(text);
Expected Output: 2

Note that this isn't a complete README. It's just the plain English tests. You are still responsible for adding any other README information for your projects.

Let's deal with the next simplest behavior. There are actually several different behaviors we could tackle next, so don't get too caught up on whether one behavior is simpler to implement than another if the distinction is not obvious. There is no preset route to building an application with TDD. Just do your best to implement one small behavior at a time.

We'll start by dealing with the fact that our function returns 1 for an empty string. Here's the plain English test:

Test: "It should return 0 for an empty string."
Code: wordCounter("");
Expected Output: 0

Note that this test still belongs to the group of tests we've written so far, which means it doesn't have a separate line for Describe. Also, we've put the code inline because there's just one line of code. How you format your own plain English tests is up to you. Just make sure they are easy to read.
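One way to check plain English tests by hand is to run each Code line in the console and compare the result with the Expected Output. The sketch below does the same thing with a small helper. Note that checkTest() is a hypothetical helper and not part of the lesson's code, and the wordCounter() shown is the version from before the empty string fix.

```javascript
// wordCounter() as it stands before we handle empty strings.
function wordCounter(text) {
  let wordCount = 0;
  const wordArray = text.split(" ");
  wordArray.forEach(function(word) {
    wordCount++;
  });
  return wordCount;
}

// Hypothetical helper: compares the actual result with the expected output
// and logs PASS or FAIL next to the test description.
function checkTest(description, actual, expected) {
  const passed = actual === expected;
  console.log((passed ? "PASS" : "FAIL") + ": " + description);
  return passed;
}

checkTest("It should return 1 if a passage has just one word.", wordCounter("hello"), 1);
checkTest("It should return 2 if a passage has two words.", wordCounter("hello there"), 2);
checkTest("It should return 0 for an empty string.", wordCounter(""), 0);
```

The first two checks log PASS. The third logs FAIL, which is what we expect from a new test before the code has been updated.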

Now let's update the code. We encourage you to write the solution yourself.

scripts.js
// Business Logic

function wordCounter(text) {
  if (text.length === 0) {
    return 0;
  }
  let wordCount = 0;
  const wordArray = text.split(" ");
  wordArray.forEach(function(word) {
    wordCount++;
  });
  return wordCount;
}

All we need to do is write a conditional that checks if the length of text is equal to 0. If it is, the function will return 0. If we try wordCounter(""); in the console, it will return 0 as expected. So now we're ready to move on. Or are we? Think carefully. Have we solved the empty string problem fully yet?

Try this:

wordCounter("            ");

According to our function, that's 13 words. Not good. We could update our empty string test or write a new one. To be thorough and practice, we'll write a new test. Again, we are taking baby steps here.

Test: "It should return 0 for a string that is only spaces."
Code: wordCounter("            ");
Expected Output: 0

It's a quick fix:

scripts.js
// Business Logic

function wordCounter(text) {
  if (text.trim().length === 0) {
    return 0;
  }
  let wordCount = 0;
  const wordArray = text.split(" ");
  wordArray.forEach(function(word) {
    wordCount++;
  });
  return wordCount;
}

We can use String.prototype.trim() to trim all whitespace from both ends of a string. Since the string is all whitespace, that will reduce it to "", which has a length of 0.
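Here is a small console sketch of what is happening with the string of spaces. It builds the twelve-space string with String.prototype.repeat() so the count is easy to verify; the variable names are illustrative only.

```javascript
// Twelve spaces, built with repeat() so we don't have to count them by eye.
const onlySpaces = " ".repeat(12);

// Splitting on a space yields one more element than there are spaces.
console.log(onlySpaces.split(" ").length); // 13

// trim() removes whitespace from both ends, leaving an empty string.
console.log(onlySpaces.trim().length); // 0

// trim() does not touch the spaces between words.
const padded = "   hello there   ";
console.log(padded.trim()); // "hello there"
```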

Another test passing! These little details can lead to big bugs down the road if we don't think about them early on.

Next, let's think about words. What is a word exactly? We aren't going to enforce whether something is legally a word or not, and neither does Google Docs. If we were to write a fantasy novel with Xoeo and Myxtmidia as the main characters, Google Docs would be more than happy to call those words. Google Docs also counts numbers as words, but we are going to be more precise than that. A spelled-out number ("seven") is a word, but 7 is a number, so we won't include it in the word count. So let's get started with a test.

Test: "It should not count numbers as words."
Code: wordCounter("hi there 77 19");
Expected Output: 2

In our test, we mix together words and numbers. Our function should properly count the words but ignore the numbers in the count. If we actually test wordCounter("hi there 77 19"); in the console right now, it will return 4. That's not what we want. However, we will acknowledge that we could have characters named Epsilon72 and Eri9er in our upcoming science fiction novel. (We don't know how to pronounce Eri9er, but it's the future and they've figured that kind of thing out.)

So let's update the function to get this test passing.

Once again, see if you can do this on your own.

scripts.js
// Business Logic

function wordCounter(text) {
  if (text.trim().length === 0) {
    return 0;
  }
  let wordCount = 0;
  const wordArray = text.split(" ");
  wordArray.forEach(function(element) {
    if (!Number(element)) {
      wordCount++;
    }
  });
  return wordCount;
}

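The change above will get the result we want. All we have to do is add a conditional. Number() returns either a number or NaN: Number("16") returns 16, while Number("hi") returns NaN. NaN is falsy, so !Number(element) is true when the element is not a number, and we increment wordCount. For a number such as 77, Number() returns a truthy value, so we don't increment it. One caveat: 0 is also falsy, so this check would count the string "0" as a word.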
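The following console sketch shows how Number() behaves with a few inputs, including one edge case. The isNumberString() helper at the end is a hypothetical, stricter alternative that is not part of the lesson's code; it is shown only as one possible way to refactor.

```javascript
console.log(Number("16"));  // 16
console.log(Number("hi"));  // NaN
console.log(!Number("hi")); // true, because NaN is falsy
console.log(!Number("16")); // false, because 16 is truthy

// Edge case: 0 is falsy too, so "0" passes the !Number(element) check
// and would be counted as a word.
console.log(!Number("0"));  // true

// Hypothetical stricter check: true only for non-empty strings that
// Number() can convert to something other than NaN.
function isNumberString(element) {
  return element.trim().length > 0 && !Number.isNaN(Number(element));
}

console.log(isNumberString("0"));         // true
console.log(isNumberString("77"));        // true
console.log(isNumberString("hi"));        // false
console.log(isNumberString("Epsilon72")); // false
```

With a helper like this, the conditional inside the loop could read if (!isNumberString(element)), though the simpler check is enough to get our current test passing.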

Note also that we changed the callback function's parameter to element because it might not always be a word. It's good practice to rename a variable when you realize a different name more accurately communicates what your code does.

Once again, this was a pretty small change. It's much easier to do this incrementally.

There's another thing we haven't thought about: punctuation. However, this is easier to handle with a regular expression. We haven't learned about regular expressions yet and they are a further exploration, so don't worry about them right now. We could do even more to make this function robust, but for the purposes of demonstrating how to use TDD practices, this is robust enough.

Writing and Testing a Second Function

Let's move on and get a little more practice by writing a second function. This one will also use a loop. The purpose of this function will be to determine how many times a specific word occurs in a passage. We'll call this function... um... wordCounter()? No. We've got to be clear with our code and that's already taken. We'll call it numberOfOccurrencesInText(). The name is a bit lengthy, but it states exactly what the function does, which will help us communicate with other developers.

Let's say we have the following passage of text:

"red blue red red green red"

If we ask our function how many times the color red occurs, it should correctly return 4.

However, that is not our first test. We can start even smaller than that. We can start with how many times a word occurs in an empty string, which should be 0 no matter what. That's probably as small as we can go.

We'll want to start a new group of tests for this function, which means a new Describe block. We can add this test below the other tests in our README.

Describe: numberOfOccurrencesInText()
Test: "It should return 0 occurrences of a word for an empty string."
Code:
const text = "";
const word = "red";
numberOfOccurrencesInText(word, text);
Expected Output: 0

Once again, this basic test can really help us get started. Let's take a look.

scripts.js
// Business Logic

// wordCounter() function omitted for brevity.

function numberOfOccurrencesInText(word, text) {
  return 0;
}

We add the numberOfOccurrencesInText() function beneath our wordCounter() function. See how nicely our business logic is coming together because we aren't thinking about the UI? When we start working on our user interface logic, it will be much easier to keep things separate.

While our function is basic so far, it allows us to establish a couple of key things. First, this function needs two parameters, one for the word we want to find and one for the text itself. Secondly, just like with our wordCounter() function, it will return a number.

Next, let's see what happens when we are searching text that is just one word.

Test: "It should return 1 occurrence of a word when the word and the text are the same."
Code:
const text = "red";
const word = "red";
numberOfOccurrencesInText(word, text);
Expected Output: 1

Let's update our function. Once again, we are aiming to keep it as simple as possible. It's okay if it looks nothing like the final product yet. We are just taking baby steps.

scripts.js
function numberOfOccurrencesInText(word, text) {
  if (word === text) {
    return 1;
  }
  return 0;
}

We add a simple conditional. If the word equals the text, we should return 1. Otherwise, we should return 0. Very simple. Both tests will pass now.

Are we ready to move on to multiple words? Well, first we should verify that it doesn't return a match if the word and the text aren't the same.

Here's the test:

Test: "It should return 0 occurrences of a word when the word and the text are different."
Code:
const text = "red";
const word = "blue";
numberOfOccurrencesInText(word, text);
Expected Output: 0

This test will pass already so you might wonder what the point is. Well, first of all, it's always good to verify, not assume. You don't ever want to tell the team lead or your boss that you assumed something would work when everything goes terribly awry. Also, with automated testing, we might find later in the process that something breaks this specific test while all of our other tests pass correctly. Then we could more easily go back and fix the issue.

Let's move on to multiple words.

Test: "It should return the number of occurrences of a word."
Code:
const text = "red blue red red red green";
const word = "red";
numberOfOccurrencesInText(word, text);
Expected Output: 4

You might be wondering why we are moving up to so many words and occurrences already. Why not just move up to two words first? Well, the code should work exactly the same way for either - and with more occurrences we are less likely to get a false positive. Our function already returns 1 sometimes, so if we just have two words and one of them is red, our code may return the right answer even if it's broken - just as a broken clock is right twice a day.
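To make the false positive idea concrete, here is a deliberately incomplete function. The name brokenOccurrences() is hypothetical and the function exists only for illustration. It uses Array.prototype.includes() to check whether the word appears at all, so it can only ever answer 0 or 1.

```javascript
// Deliberately incomplete (for illustration only): it reports whether
// the word appears in the text, not how many times it appears.
function brokenOccurrences(word, text) {
  if (text.split(" ").includes(word)) {
    return 1;
  }
  return 0;
}

// A two-word test passes even though the function cannot really count.
console.log(brokenOccurrences("red", "red blue")); // 1, which happens to be correct

// The larger test exposes the problem, since the expected output is 4.
console.log(brokenOccurrences("red", "red blue red red red green")); // 1
```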

Let's update our code to get our new test passing:

scripts.js
function numberOfOccurrencesInText(word, text) {
  const wordArray = text.split(" ");
  let wordCount = 0;
  wordArray.forEach(function(element) {
    if (word === element) {
      wordCount++;
    }
  });
  return wordCount;
}

This doesn't look very different from our previous code - but we actually modified the conditional from our previous test to use within our loop. Instead of having to write all the code at once, we got a sense of what our parameters and return value should look like and we also got a good start on our conditional.

Once again, we split the text passage into an array and create a wordCount that starts at 0. We loop through this array, and if the word we've passed into our function is equal to the element in wordArray, we've found an instance of the word and we can increment wordCount by one. Finally, we return wordCount.

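It works! Yay! Except... we should double-check that updating the code didn't break anything we already had working. That's easy to forget with manual testing, but with Jest, every earlier test is rerun automatically and any broken one shows up as a bright red failure. Let's revisit our first test:

Describe: numberOfOccurrencesInText()

Test: "It should return 0 occurrences of a word for an empty string."
Code:
const text = "";
const word = "red";
numberOfOccurrencesInText(word, text);
Expected Output: 0

As written, this test still returns 0: "".split(" ") produces [""], and "red" does not equal "", so wordCount is never incremented. But the function is only passing by accident, because it no longer returns 0 directly and instead loops over that single empty element. (That same empty element is why wordCounter() originally returned 1 for an empty string.) If the word passed in were also an empty string, the function would return 1. This kind of thing happens all the time when updating code. You get one thing working but you break, or nearly break, another thing in the process. And if you're not testing, you probably won't even notice until there's a bug or the application is on fire (figuratively). Not good. Automated testing catches this stuff.

So let's update the function to handle empty text explicitly, the same way we did in wordCounter():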

scripts.js
function numberOfOccurrencesInText(word, text) {
  if (text.trim().length === 0) {
    return 0;
  }
  const wordArray = text.split(" ");
  let wordCount = 0;
  wordArray.forEach(function(element) {
    if (word === element) {
      wordCount++;
    }
  });
  return wordCount;
}
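Since we don't have Jest yet, one possible way to rerun every earlier test after a change is to keep the test inputs in an array and loop over them. This is only a sketch: the occurrenceTests array and the failures variable are names made up for illustration, and each entry mirrors one of the plain English tests written so far.

```javascript
function numberOfOccurrencesInText(word, text) {
  if (text.trim().length === 0) {
    return 0;
  }
  const wordArray = text.split(" ");
  let wordCount = 0;
  wordArray.forEach(function(element) {
    if (word === element) {
      wordCount++;
    }
  });
  return wordCount;
}

// Each entry mirrors one plain English test from the README.
const occurrenceTests = [
  { word: "red", text: "", expected: 0 },
  { word: "red", text: "red", expected: 1 },
  { word: "blue", text: "red", expected: 0 },
  { word: "red", text: "red blue red red red green", expected: 4 }
];

// Collect every test whose actual result differs from the expected output.
const failures = occurrenceTests.filter(function(testCase) {
  return numberOfOccurrencesInText(testCase.word, testCase.text) !== testCase.expected;
});

console.log(failures.length + " failing test(s)"); // 0 failing test(s)
```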

Now everything is good to go. But what about...

"Red RED red"

We need to account for upper and lowercase. "Red" and "red" are still the same word - but our function will not recognize this. Once again, let's start with a test.

Test: "It should return a word match regardless of case."
Code:
const text = "red RED Red green Green GREEN";
const word = "Red";
numberOfOccurrencesInText(word, text);
Expected Output: 3

Note that our test will be a bit more thorough because we are also changing the case of the word variable. It should be evident here that we need to do something that makes both the word and all instances of that word in the text variable consistent, such as lower-casing them. If the words that are being compared are different cases, our function won't see them as a match.

Try getting the test passing on your own first. The passing code is below:

scripts.js
function numberOfOccurrencesInText(word, text) {
  if (text.trim().length === 0) {
    return 0;
  }
  const wordArray = text.split(" ");
  let wordCount = 0;
  wordArray.forEach(function(element) {
    if (word.toLowerCase() === element.toLowerCase()) {
      wordCount++;
    }
  });
  return wordCount;
}

As we can see, we just need to call String.prototype.toLowerCase() on both word and element to get the test passing.

This is one of those tests where we really do need to think carefully about what we are testing and how we can make sure that our function works as expected. It would be easy to write a test that just lowercases text and doesn't take account of the fact that a user might type in "RED" instead of "red". Fortunately, our test accounts for both the case of the parameter and the case of each element in the text.
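To illustrate why the test varies the case of the word as well as the text, here is a hypothetical variant that lowercases only the elements of the text. The function name occurrencesLoweringTextOnly() is made up for this sketch and is not part of the application.

```javascript
// Hypothetical variant: lowercases each element of the text but not the word.
function occurrencesLoweringTextOnly(word, text) {
  let wordCount = 0;
  text.split(" ").forEach(function(element) {
    if (word === element.toLowerCase()) {
      wordCount++;
    }
  });
  return wordCount;
}

const passage = "red RED Red green Green GREEN";

// A test that only ever searches for a lowercase word would pass.
console.log(occurrencesLoweringTextOnly("red", passage)); // 3

// Our test searches for "Red", which never equals a lowercased element.
console.log(occurrencesLoweringTextOnly("Red", passage)); // 0
```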

It may also be tempting to just lowercase a user's input in the user interface section of the code - similarly to how we've used parseInt() to make sure that a number input on a form is converted from a string to a number. However, this wouldn't be good separation of logic. Remember, it's our function's job to correctly analyze any strings it receives. If we did that in the UI instead, it would be harder to test, harder to track, and more prone to bugs. Our function would also be less resilient and reusable.

Let's move on to our next test. Can you think of anything else that still needs to be tested? Are there any other situations where our function won't correctly compute a matching string when it should?

What about this string?

"Red! Red. I like red, don't you?"

If we split this string by spaces, we'll get the following array:

["Red!", "Red.", "I", "like", "red,", "don't", "you?"]

Well, "red" should match "Red!", "Red.", and "red," in this array. Currently, though, it won't. So let's write a test.

Test: "It should return a word match regardless of punctuation."
Code:
const text = "Red! Red. I like red, green, and yellow.";
const word = "Red";
numberOfOccurrencesInText(word, text);
Expected Output: 3

Now let's get our test passing. There are a couple of ways we can solve this problem.

  • One way is to use the method String.prototype.includes(), which checks to see if a string includes another string or character. We're going to use this approach. String.prototype.includes() is a very handy string method and one which you'll likely use multiple times throughout this section.

  • The other approach is to use a regular expression. We are going to cover this approach in a further exploration lesson on regular expressions. Further exploration means it's not required to learn about regular expressions and you won't need to use them in this section or on the independent project, though you can experiment with them if you like.

Let's solve the problem using String.prototype.includes(). First, let's see what this method actually does.

String.prototype.includes() returns a boolean. If a string contains another string, the method will return true. For instance, the string "epicodus" contains the string "epic". If the string doesn't include the substring, the method will return false. We can do something like this:

function includesRarestLetter(word) {
  if (word.toLowerCase().includes("q")) {
    return true;
  }
  return false;
}

Q is one of the rarest letters in English (by some counts, the rarest), and this function checks whether a word contains the letter, returning true if it does and false otherwise.

We can also use String.prototype.includes() with longer strings - such as checking whether a string includes the substring "red":

"red! red. red?".includes("red");
true

Let's update our function to get our newest test passing:

function numberOfOccurrencesInText(word, text) {
  if (text.trim().length === 0) {
    return 0;
  }
  const wordArray = text.split(" ");
  let wordCount = 0;
  wordArray.forEach(function(element) {
    if (element.toLowerCase().includes(word.toLowerCase())) {
      wordCount++;
    }
  });
  return wordCount;
}

We've updated our conditional to check if the following is true:

element.toLowerCase().includes(word.toLowerCase())

So if an element in the text array (such as "red.") includes the word we are searching for ("red"), wordCount will be incremented - and our test will pass!

String.prototype.includes() is a very helpful method. There is a problem, though:

"redo".includes("red");
true

Yes, the word "redo" contains the word "red" - but it's not an occurrence of the word. We aren't going to worry about this issue, though you are welcome to refactor the application on your own to fix this with an additional test.
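If you do want to try that refactor without regular expressions, one possible approach is to strip punctuation from both ends of each element and then go back to an exact comparison. The sketch below is only one way to do it. The stripPunctuation() helper is hypothetical, and its list of punctuation characters is an assumption that is far from exhaustive.

```javascript
// Hypothetical helper: removes punctuation from both ends of one element.
// The punctuation list is an assumption and is not exhaustive.
function stripPunctuation(element) {
  const punctuation = ".,!?;:\"'()";
  let start = 0;
  let end = element.length;
  while (start < end && punctuation.includes(element[start])) {
    start++;
  }
  while (end > start && punctuation.includes(element[end - 1])) {
    end--;
  }
  return element.slice(start, end);
}

function numberOfOccurrencesInText(word, text) {
  if (text.trim().length === 0) {
    return 0;
  }
  const wordArray = text.split(" ");
  let wordCount = 0;
  wordArray.forEach(function(element) {
    // Exact comparison again, but only after the punctuation is stripped.
    if (stripPunctuation(element).toLowerCase() === word.toLowerCase()) {
      wordCount++;
    }
  });
  return wordCount;
}

console.log(numberOfOccurrencesInText("Red", "Red! Red. I like red, green, and yellow.")); // 3
console.log(numberOfOccurrencesInText("red", "redo the red door")); // 1
```

A matching plain English test might read "It should not count a longer word that contains the word," with "redo the red door" as the text and an expected output of 1.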

So our numberOfOccurrencesInText() function isn't perfect but that's okay. Once again, the main purpose here is to learn about test-driven development - how it works, how to apply it, and how to write plain English tests to gradually build up robust functions and solve problems. For the next several sections, you will use this TDD approach with plain English tests. As we've mentioned before, in Test Driven Development and Environments with JavaScript, we will start using Jest for our tests.

Tests

Here are all the tests we wrote in this lesson. They should provide a good sense both of what a plain English test should look like and a general progression from simplest behavior to more complex behavior.

Describe: wordCounter()

Test: "It should return 1 if a passage has just one word."
Code:
const text = "hello";
wordCounter(text);
Expected Output: 1

Test: "It should return 2 if a passage has two words."
Code:
const text = "hello there";
wordCounter(text);
Expected Output: 2

Test: "It should return 0 for an empty string."
Code: wordCounter("");
Expected Output: 0

Test: "It should return 0 for a string that is only spaces."
Code: wordCounter("            ");
Expected Output: 0

Test: "It should not count numbers as words."
Code: wordCounter("hi there 77 19");
Expected Output: 2


Describe: numberOfOccurrencesInText()

Test: "It should return 0 occurrences of a word for an empty string."
Code:
const text = "";
const word = "red";
numberOfOccurrencesInText(word, text);
Expected Output: 0

Test: "It should return 1 occurrence of a word when the word and the text are the same."
Code:
const text = "red";
const word = "red";
numberOfOccurrencesInText(word, text);
Expected Output: 1

Test: "It should return 0 occurrences of a word when the word and the text are different."
Code:
const text = "red";
const word = "blue";
numberOfOccurrencesInText(word, text);
Expected Output: 0

Test: "It should return the number of occurrences of a word."
Code:
const text = "red blue red red red green";
const word = "red";
numberOfOccurrencesInText(word, text);
Expected Output: 4

Test: "It should return a word match regardless of case."
Code:
const text = "red RED Red green Green GREEN";
const word = "Red";
numberOfOccurrencesInText(word, text);
Expected Output: 3

Test: "It should return a word match regardless of punctuation."
Code:
const text = "Red! Red. I like red, green, and yellow.";
const word = "Red";
numberOfOccurrencesInText(word, text);
Expected Output: 3
