The emerging Federated Learning (FL) paradigm offers significant advantages over the traditional centralized architecture of machine learning (ML) systems by reducing privacy risks and distributing the computational load. However, the network topology (i.e., the number of available clients and their characteristics) has a critical impact on performance. This work investigates how application-specific requirements can drive architectural choices and how such choices affect FL performance. Specifically, we present a requirement-driven reference architecture for FL applications. Using a standard benchmark, we empirically evaluate 20 realizations of this architecture under different boundary conditions. We assess the effectiveness of each realization in terms of the accuracy of the trained model and the wall-clock time required to complete training. By combining our experimental results with existing qualitative studies from the literature, we derive a guideline that helps prospective users select the configuration best suited to their application-specific non-functional requirements.