Code Cleanup
There are currently several outstanding issues with the code style and layout in Autotest that are a continuing source of problems. While none of these issues are immediate bugs, they are currently a source of maintenance problems.
Tabs vs Spaces
Overview
Currently our coding standard is to use hard tabs for indentation, equivalent to 8 spaces. In the long-term we'd like to move to a more "standard" indentation, using 4 spaces to indent. There are a few specific issues with doing this:
Revision control system noise: a mass tab->space change will generate "do-nothing" revisions, and will reduce the usefulness of command like "blame"
Labour-intensive: we can automate the conversion of leading tabs -> four spaces, but this won't properly lay out multi-line commands
- Conflicts: when the conversion happens, any in-flight changes that aren't currently checked in will cause massive conflicts.
However, none of these problems should really be show stoppers. The revision control noise is an irritation, but if we perform the conversion in one shot then it should isolate the noise to a single revision and the problem with commands like blame can be worked around (e.g. if svn blame tells you the line was last touched by the tab->space revision, then use the -r option to get the blame from just before the whitespace conversion). Since there has also been consideration of moving from Subversion to a different revision control system the revision noise might end up being a moot point anyway.
The amount of work required for a conversion shouldn't be a huge problem. If it turns out to be too much of a problem to properly clean up the multi-line statements, that work could just be left for a later date; this is similar to what was done with standards such as 80-character lines, which wasn't enforced in older code, but wasn't cleaned up en masse when the standard was set, either. This would probably even help with the third issue, conflicts; if the mass conversion is entirely automated then we can simply provide a conversion script that people with in-flight changes can run on their edited files, which should take care of any conflicts.
What needs to be done
- Write up a script to automate the conversion process for a single Python file.
Pick a date for the conversion, and make sure everyone likely to be doing work on Autotest is aware of it. This date should be chosen after consultation with the people on the Autotest mailing list.
- On the date of the conversion, apply the conversion script to all of the Python code in Autotest.
- Test the converted code.
- Commit the conversion to the repository. This should probably be done directly, rather than mailing a massive patch to the external list.
- Notify the mailing list, and make sure that everyone on it is provided with any scripts/instructions that can be used to ease the resolution of conflicts with their in-progress changes.
Relative vs Absolute imports
A more serious style problem in Autotest has been the inconsistent importing style. Imports are usually "relative", relying on the modules being implicitly found in the same directory as the code that's performing the import, but with a variety of absolute imports mixed in, with sys.path manipulations. These have mostly been a side effect of code reuse; for example code in /server trying to make use of code in /client/bin. There were previous attempts to alleviate the problem by creating a common library in client/common_lib, but over time this solution hasn't really resolved the problems.
In order to work around these problems, changes were recently made to the common.py files that can be found in (most of) the Autotest directories. These files will set up a module namespace under autotest_lib, rooted at the Autotest directory (or in the case of a standalone client, autotest_lib.client rooted at the client directory). The means that all we need to do is have every entry point (i.e. any file being directly executed) perform an "import common" before any autotest imports, and then all autotest imports can be done with code such as "from autotest_lib.server import kernel" to import the server/kernel.py module. These changes are already in place and being used by some of the Autotest code, so all that remains is to apply these changes to import to any remaining existing code that still uses relative imports.
At this point the conversion has been proceeding on a file-by-file basis, converting imports as we go (or whenever a problem comes up with the existing imports). It be a good idea to at some point just do a "final sweep" where we go through and change any remaining relative imports to absolute ones, since a "change as we go" approach will only get us so far along (rarely changed files will take a long time to ever be updated).
When doing an absolute -> relative conversion, the import things to remember are:
- Make sure "import common" occurs at your entry points (don't scatter it all throughout the code "just in case").
- Don't mix-and-match; if you're going to use the absolute imports, convert all of the imports in the file. Mixing imports has on occasion let to problems where the relative and absolute imports collided with eachother.
Use "from" to import specific modules, instead of importing something like "autotest_lib.server.utils" and then having to prepend every function call with three module names (i.e. do from autotest_lib.server import utils, NOT from autotest_lib.server.utils import *). If there are naming conflicts (e.g. between the common lib utils and the server utils) then you can make use of "from ... import ... as ..." to use aliases.
Common library refactorings
Over the past few months a good job of moving duplicated code from client+server->common_lib has been done, but there's still some room for improvement. Most of the obvious low-hanging fruit (e.g. commonly usable utils functions) have been moved, and most of the remaining ones are either somewhat client/server specific, or at least easy enough to move when there's a demand; actual duplication is now at a minimum there. Still, there is some room for improvement. Some rather obvious areas where code is duplicated are:
- client/bin/job.py versus server/server_job.py
- client/bin/parallel.py versus server/subcommand.py
- client/bin/kernel.py versus the server/*_kernel.py files
Actually pulling common functionality out of these files into the common library will be somewhat difficult; while there's quite a bit of duplication between these files, there's also a lot of customization and client/server-specific code. Still, any code that could be pulled into the common library would be an improvement; like with client/test.py and server/test.py, there tends to be a lot of times when bugfixes/enhancements that aren't specific to the client or server but they don't make it into both sets of code, often leading to bugs having to be fixed twice, or the server/client having needlessly mis-matched capabilities.
