In this case study, we use detailed data from four software industry projects, totaling 264 months of development and 1.1 million SLoC, to compare six metrics and their suitability for productivity benchmarking at the level of development output. Code change, absolute growth, and number of commits, as well as invested effort, are measured in consecutive 3-month periods. This allows us to observe changes in productivity over the course of a project and to compare projects with one another. We find correlations between effort and the chosen output metrics, as well as significant and explainable productivity differences between projects and project phases. We also analyze whether a clone detection algorithm can improve measurement by adjusting for copy & paste additions and renamed or moved code, and find a small benefit. The redundancy-adjusted number of code tokens added or modified appears to be the best of the selected metrics, particularly in ongoing development, where an existing codebase is changed. The number of commits and absolute growth may complement the picture.
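
The per-period measurement summarized above can be illustrated with a minimal sketch. It assumes that productivity in a 3-month period is taken as output divided by invested effort, and that the relation between effort and an output metric is checked with a Pearson correlation; neither the exact formula nor the effort unit is given in this abstract. The names `Period`, `productivity`, and `pearson`, the person-month unit, and all numbers are hypothetical placeholders, not data from the study.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Period:
    """One consecutive 3-month measurement period (hypothetical structure)."""
    effort_pm: float     # invested effort; person-months is an assumed unit
    tokens_changed: int  # redundancy-adjusted tokens added or modified
    commits: int         # number of commits in the period


def productivity(period):
    """Output per unit of effort for one period (assumed definition)."""
    return period.tokens_changed / period.effort_pm


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up example values, for illustration only.
periods = [
    Period(effort_pm=6.0, tokens_changed=12000, commits=90),
    Period(effort_pm=9.0, tokens_changed=18000, commits=140),
    Period(effort_pm=12.0, tokens_changed=21000, commits=160),
    Period(effort_pm=7.5, tokens_changed=15000, commits=100),
]

per_period = [productivity(p) for p in periods]
effort_vs_tokens = pearson(
    [p.effort_pm for p in periods],
    [p.tokens_changed for p in periods],
)
```

Comparing `per_period` across periods corresponds to tracking productivity over the course of one project; computing the same values for several projects would allow the inter-project comparison.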