Outside of performance-focused domains, research software is typically designed with its output in mind rather than its runtime efficiency. As a result, such software often consumes more resources (time, hardware) than necessary and scales poorly, which hinders larger or longitudinal studies unless the software is adapted. In this paper, we report our experiences of iteratively identifying and optimizing performance bottlenecks in an established research tool to enable such analyses. Specifically, we applied a top-down strategy to ToolX, an architecture-smell detection tool, to develop ToolY, a tool for tracing architecture smells through software evolution. To identify performance bottlenecks and benchmark our improvements, we used the Qualitas Corpus and a custom dataset. We reduced processing time by approximately 98% and lowered the runtime complexity from almost quadratic to close to linear. By sharing our process and insights, we hope to guide researchers in optimizing their own research software.