CS 41: Hap.py Code is a Stanford course about the Python programming language. This autograder, implemented entirely in Python, lets the user run student code, run solution code, and compare the output of the two. The user can define several tests, execute them concurrently, and hook into the module to post-process the test results.
A sample run of the CS 41 autograder.
Here are the steps to get off the ground with this autograder:
- Instantiate autograder.Autograder. Create a class that inherits from autograder.Autograder, for example class TestAutograder(Autograder).
- Initialize the parent instance. I tend to do this inside the __init__ method of my subclass with super().__init__. The only required argument to that function is the name of the student module, provided as a string module_name. If you'd like to add to the autograder, specify has_custom_tests=True.
- Write run_custom_tests. Override the run_custom_tests(self) function and add any custom tests. When that function is executed, self.module will contain the student module object.
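The three steps above can be sketched as follows. To keep the example runnable on its own, it defines a hypothetical stand-in for Autograder and registers a fake student module named student_demo; both are assumptions made for illustration, and the real class may take more arguments or behave differently.

```python
import importlib
import sys
import types


class Autograder:
    """Hypothetical stand-in for autograder.Autograder, NOT the real class.

    In real use, import Autograder from the autograder package instead.
    """

    def __init__(self, module_name, has_custom_tests=False):
        self.has_custom_tests = has_custom_tests
        # Assumption: the real class imports the student module by name.
        self.module = importlib.import_module(module_name)

    def run_custom_tests(self):
        raise NotImplementedError


# Fake student module so the sketch runs without any extra files.
student = types.ModuleType("student_demo")
exec("def add(a, b):\n    return a + b", student.__dict__)
sys.modules["student_demo"] = student


class TestAutograder(Autograder):
    def __init__(self):
        # Step 2: initialize the parent with the student module's name.
        super().__init__("student_demo", has_custom_tests=True)

    def run_custom_tests(self):
        # Step 3: self.module is the student module object here.
        return self.module.add(2, 3) == 5


grader = TestAutograder()
custom_passed = grader.run_custom_tests()
```

Here run_custom_tests is called directly only for demonstration; how the real autograder invokes it is not covered by this sketch.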
The autograder module provides several test classes that inherit from one another in a linear hierarchy.
- BaseTest: The BaseTest must be provided the student_obj and the solution_obj. It simply compares the two objects and passes if they are the same and fails otherwise.
- ArgTest: ArgTest can additionally be provided with args and kwargs. The autograder runs the functions in a sandboxed environment and compares their return values, output, and any errors they threw.
- IOTest: The IOTest allows the autograder to overwrite sys.stdin and provide input to the student and solution programs when they call input. The text inputs should be provided as in_params.
- FileIOTest: A FileIOTest is provided a filename and generates an IOTest from the contents of that file.
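To make the comparison idea concrete, here is a minimal sketch of what an ArgTest or IOTest style check could look like using only the standard library. The helpers run_captured and same_behaviour are hypothetical names, not part of the autograder, and the sketch does no real sandboxing.

```python
import contextlib
import io
import sys


def run_captured(func, *args, stdin_text="", **kwargs):
    """Call func and return (return value, printed output, exception type)."""
    out = io.StringIO()
    result, error = None, None
    original_stdin = sys.stdin
    sys.stdin = io.StringIO(stdin_text)  # input() now reads from this text
    try:
        with contextlib.redirect_stdout(out):
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                error = type(exc)
    finally:
        sys.stdin = original_stdin  # always restore the real stdin
    return result, out.getvalue(), error


def same_behaviour(student_fn, solution_fn, *args, stdin_text="", **kwargs):
    """Pass only if return value, output, and error type all agree."""
    student = run_captured(student_fn, *args, stdin_text=stdin_text, **kwargs)
    solution = run_captured(solution_fn, *args, stdin_text=stdin_text, **kwargs)
    return student == solution


def solution_divide(a, b):
    return a / b


def student_divide(a, b):
    return 0 if b == 0 else a / b  # hides the error the solution raises


def greet():
    print("Hello, " + input())


args_ok = same_behaviour(student_divide, solution_divide, 6, 3)
args_bad = same_behaviour(student_divide, solution_divide, 1, 0)
greeting = run_captured(greet, stdin_text="Ada\n")
```

The second comparison fails because the solution raises ZeroDivisionError while the student version returns 0, which is the kind of difference an error-aware test is meant to catch.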
autograder.testsuite contains a class called TestSuite. This class allows the user to add several tests to the autograder, run them concurrently, and tabulate the results. You can enable concurrency by passing multiprocess=True to the constructor of the TestSuite. You can also hook into the test suite using a machine learning algorithm by passing in a function as the argument ml. After all tests have finished, ml will be called with a list of ones and zeros where the ith entry corresponds to the ith test (one indicates that the student passed the test and zero indicates that the student failed).
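As an illustration of the hook, the function below accepts the list of ones and zeros described above and summarizes it. The name score_hook and the shape of its return value are assumptions; the hook can do anything with the list, and no actual machine learning is required.

```python
def score_hook(results):
    """Post-process pass/fail flags: results[i] is 1 if test i passed, else 0."""
    failed = [i for i, flag in enumerate(results) if flag == 0]
    return {
        "passed": sum(results),
        "total": len(results),
        "failed_indices": failed,
    }


# In real use this function would be supplied as the ml argument;
# here it is called directly with sample results.
summary = score_hook([1, 0, 1, 1])
```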
The autograder supports a module_overrides argument that should be a dictionary mapping strings to objects. The autograder will override the associated mappings at the module level within the student file.
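One plausible way to picture module-level overriding is rebinding names on the module object, as in this sketch. The helper apply_overrides and the fake student module are hypothetical; the autograder's actual mechanism may differ.

```python
import types


def apply_overrides(module, module_overrides):
    """Rebind each module-level name in module_overrides on the module."""
    for name, obj in module_overrides.items():
        setattr(module, name, obj)


# Fake student module with a module-level constant and a function using it.
student = types.ModuleType("student_demo")
exec(
    "MAX_TRIES = 3\n"
    "def tries():\n"
    "    return MAX_TRIES\n",
    student.__dict__,
)

before = student.tries()
apply_overrides(student, {"MAX_TRIES": 1})
after = student.tries()
```

Because functions look up global names at call time, the student's tries function sees the overridden value without being redefined.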
Each test supports a setup_fn and a cleanup_fn that will be called before and after the test runs, respectively. These functions can be used to modify the filesystem and inputs or otherwise clean up before and after the test runs.
If the autograder is called with the --progressive or -p flag at the command line, it stops at the first output error in each program and prompts the grader to enter PRIOR, SUBSEQ, or BOTH, which display the prior lines, the subsequent lines, or the entire diff, respectively.
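A rough sketch of the three views, under one interpretation of the description: PRIOR returns the student's lines before the first mismatch, SUBSEQ returns the lines from the mismatch onward, and BOTH returns a unified diff. The functions first_mismatch and progressive_view are hypothetical and the real output format may differ.

```python
import difflib


def first_mismatch(student_lines, solution_lines):
    """Return the index of the first differing line, or None if identical."""
    for i, (a, b) in enumerate(zip(student_lines, solution_lines)):
        if a != b:
            return i
    if len(student_lines) != len(solution_lines):
        return min(len(student_lines), len(solution_lines))
    return None


def progressive_view(student_lines, solution_lines, choice):
    """Return the lines to display for PRIOR, SUBSEQ, or BOTH."""
    index = first_mismatch(student_lines, solution_lines)
    if index is None:
        return []
    if choice == "PRIOR":
        return student_lines[:index]
    if choice == "SUBSEQ":
        return student_lines[index:]
    if choice == "BOTH":
        return list(difflib.unified_diff(solution_lines, student_lines, lineterm=""))
    raise ValueError("choice must be PRIOR, SUBSEQ, or BOTH")


student_out = ["a", "b", "x", "d"]
solution_out = ["a", "b", "c", "d"]

prior = progressive_view(student_out, solution_out, "PRIOR")
subseq = progressive_view(student_out, solution_out, "SUBSEQ")
both = progressive_view(student_out, solution_out, "BOTH")
```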
Module overrides and the progressive diff features do not work in multiprocessing mode. The progressive diff feature cannot be repaired because the OS restricts access to sys.stdin so the autograder can't ask the grader for input. It is possible to repair the issue with module overrides, but that will require significant refactoring.
