Saturday, 5 December 2020

Refactoring messy Test Suite in Python

Recently, I did some code reviews on my peer's code. We've created 100+ pySpark jobs and each job has its own test cases. When I looked at the file, it was super long. Let's take a look at main function. It is pretty general test suite main function. ``` if __name__ == "__main__": loader = unittest.TestLoader() suite = create_unit_suite() runner = unittest.TextTestRunner(verbosity=2) runner.run(suite) ``` However, when diving into create_unit_suite function, I was utterly dumbfounded. ``` def create_unit_suite(): suite = unittest.TestSuite() suite.addTests(loader.loadTestsFromModule(tests.test_job_1)) suite.addTests(loader.loadTestsFromModule(tests.test_job_2)) suite.addTests(loader.loadTestsFromModule(tests.test_job_3)) # ... suite.addTests(loader.loadTestsFromModule(tests.test_job_100)) suite.addTests(loader.loadTestsFromModule(tests.test_job_101)) suite.addTests(loader.loadTestsFromModule(tests.test_job_102)) # ... ``` And the import statements look like ``` import tests.test_job_1 import tests.test_job_2 import tests.test_job_3 # ... import tests.test_job_100 import tests.test_job_101 import tests.test_job_102 # ... ``` Every time a new job is created, the developers need to import it and and add a new addTests line. The folder structure is way more complicated and there is no guarantee that all developers remember to add the tests. You can't tell which jobs are missing test cases simply by looking at this file. Thanks to unittest, there is a function called ``discovery`` introduced in v3.2. It can be used like TestLoader.discover() or from the command line. From the documentation, it states that Unittest supports simple test discovery. In order to be compatible with test discovery, all of the test files must be modules or packages (including namespace packages) importable from the top-level directory of the project (this means that their filenames must be valid identifiers). In order to expose the test files, we need to make them modules. We can do that simply by adding __init__.py. Here's what's left. Specify the path and pattern for unittest to discover. No long import statements and addTests code. ``` import unittest from tests import * if __name__ == "__main__": testsuite = unittest.TestLoader().discover("tests", pattern="test_*.py") runner = unittest.TextTestRunner(verbosity=2).run(testsuite) ``` Simply run the below command to get the result. ``` python -m unittest ``` However, running all the test cases may take a great of time. In order to have more flexibility, it should be able to run in a specific module. We can use ``argparse`` to take ``--module`` flag to determine which module we are going to run test cases on. ``` import argparse ``` Put the logic in main() ``` def main(): parser = argparse.ArgumentParser() parser.add_argument("--module", action="store", default="all", help="the module of a collection of test cases to be executed") args = parser.parse_args() target_module = args.module target_path = "tests" if target_module == "all" else "tests/{}".format(target_module) testsuite = unittest.TestLoader().discover(target_path, pattern="test_*.py") runner = unittest.TextTestRunner(verbosity=2).run(testsuite) if __name__ == "__main__": main() ```

No comments:

Post a Comment

A Fun Problem - Math

# Problem Statement JATC's math teacher always gives the class some interesting math problems so that they don't get bored. Today t...