Comprehension Challenges at the Level of Software Ecosystems and Global Software Engineering
In this talk, I discuss program comprehension-related work, technologies, and challenges at Facebook from the perspective of a software engineer and researcher working in developer infrastructure. The discussion covers ownership management and work-item prediction in particular, and touches on other areas in passing (e.g., code search and indexing, program generation, and data warehousing).
Ownership management – understand what is best owned by whom. Standard processes ensure that all products and product features are owned in an accountable manner, and that this ownership extends to the underlying assets (tables, code artifacts, etc.). New forms of automation can help by suggesting more suitable owners for any given asset at a given point in time. Who the most suitable owner is changes over time, e.g., due to reorganizations and changes in individuals' job functions. Such efforts on ownership health further increase the accountability of ownership.
Work-item prediction – understand who is working on what when. Understanding what a software engineer (a developer, a production engineer, a test engineer, a data scientist, etc.) is working on is a challenging problem, especially when considering complex software engineering workflows: a) engineers use a myriad of loosely integrated tools that come, go, and evolve all the time; b) engineers switch context frequently between different ‘work items’ (such as diffs for system changes); c) infrastructure and engineering processes are not fully aware of work items.
In these two areas, and in others in developer infrastructure, a combination of ultra-large-scale data processing, data mining, data extraction and cleaning, logging integration, heuristics, and ML is needed. Also, we are not simply dealing with individual programs, monolithic systems, or well-defined systems of systems. Instead, we are dealing with a heterogeneous, rapidly evolving software ecosystem: different version control systems, continuous integration approaches, language implementations, IDEs, project management and documentation tools, and data warehouse technologies are assembled to provide one infrastructure on which, in turn, the actual apps and backend services depend. Further, development is highly distributed and depends on evolving team structures and responsibilities, and the relevant employee workflows are rather involved, which brings us into the realm of global software engineering.
The talk extracts a research agenda from experiences with these areas. None of the discussed challenges and problems is specific to the Facebook setting; in fact, much of the progress can be expected to come from research on open-source ecosystems.
 John Ahlgren, Maria Eugenia Berezin, Kinga Bojarczuk, Johann George, Natalija Gucevska, Mark Harman, Shan He, Ralf Lämmel, Erik Meijer, Silvia Sapora, and Justin Spahr-Summers. 2020. Ownership at Large – Open Problems and Challenges in Ownership Management. In Proceedings of the 28th IEEE/ACM International Conference on Program Comprehension (ICPC Industry Track). IEEE / ACM.
 Ralf Lämmel, Alvin Kerber, and Liane Praza. 2020. Understanding What Software Engineers Are Working on – The Work-Item Prediction Challenge. In Proceedings of the 28th IEEE/ACM International Conference on Program Comprehension (ICPC Industry Track). IEEE / ACM.
Slide deck for keynote: keynote-at-icpc-2020-less-cats.pdf (10.20 MiB)
Mon 13 Jul, 13:30 - 14:30 (times are displayed in UTC, Coordinated Universal Time)
Keynote: Ralf Lämmel, Facebook London