
Predicting Photovoltaics: AI Models With High Data Protection Deliver Accurate Results
The expansion of photovoltaic systems at the household level is progressing rapidly. This increases the demand for reliable electricity generation forecasts, which are needed for stable grid operation, energy communities, and flexibility management, among other things. However, energy data from households is considered particularly sensitive and is subject to strict data protection requirements. A research team from Salzburg Research/Intelligent Connectivity and Paris Lodron University of Salzburg has now demonstrated that federated AI models can significantly ease this conflict of objectives.
In a recent study, our colleagues compared traditional, centrally trained prediction models with federated, edge-based learning approaches. Whereas centrally trained models collect all data in one place, federated learning keeps the measurement data in the households themselves or at local gateways. Only anonymized model updates are exchanged and merged centrally. The results of the study were presented at the prestigious Innovative Smart Grid Technologies Europe 2025 conference, organized by the IEEE.
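The exchange of model updates described above can be illustrated with a minimal sketch of federated averaging, a common aggregation scheme. The article does not state which aggregation method the study used, so this is an assumption, as are the toy linear model, the synthetic data, and the hypothetical function names `local_update`, `federated_average`, and `make_household`.

```python
import random

def local_update(weights, data, lr=0.5, epochs=5):
    """Run a few gradient-descent epochs on one household's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)

def federated_average(updates, sizes):
    """Merge local models centrally, weighting each by its sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def make_household(rng, n):
    """Toy data (assumption): PV output in kW = 4 * irradiance + 0.2 + noise."""
    return [(x, 4.0 * x + 0.2 + rng.gauss(0, 0.05))
            for x in (rng.random() for _ in range(n))]

rng = random.Random(0)
# Three simulated households; their raw measurements never leave this list.
households = [make_household(rng, n) for n in (40, 60, 80)]

global_weights = (0.0, 0.0)
for _ in range(100):  # communication rounds
    # Each household trains locally, starting from the current global model.
    updates = [local_update(global_weights, data) for data in households]
    # Only the model parameters (w, b) are sent to the aggregator.
    global_weights = federated_average(updates, [len(d) for d in households])

print(global_weights)
```

In this sketch the aggregator only ever sees the two model parameters per household, never the individual measurements, which is the property that makes the approach attractive for sensitive energy data.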
Learning at the edge of the network instead of central data collection
The study analyzed several common machine learning methods, including tree-based ensemble models and deep neural networks. Realistic photovoltaic and weather data, simulated for several prosumer households, formed the basis of the study. The evaluation focused on forecast accuracy, computational effort, and suitability for privacy-friendly applications.
The results show that centrally trained models achieve the highest accuracy, but only by aggregating all sensitive household data in one place. Federated models, on the other hand, deliver nearly comparable forecast quality, especially when multiple households learn together in a federation.
Robust compromise between accuracy and data protection
A tree-based method known as histogram gradient boosting was found to be particularly suitable. It strikes a robust balance between prediction quality, computational effort, and data protection. Although deep neural networks can also achieve very good results, they are significantly more resource-intensive and less efficient in smaller federations.
The study suggests that a federated approach involving around a dozen participating households is particularly promising. This approach enables good predictive performance while maintaining a high level of data protection and a moderate training effort.
Relevance for energy communities and networks
The results are particularly relevant for collaborative applications, such as energy communities, local flexibility markets, and prosumer-oriented grid control. More accurate forecasts can help to reduce peak loads, make better use of local generation, and enable the more intelligent planning of flexible applications, such as time-shifted charging or energy-intensive processes. All of this can be achieved without collecting sensitive household data centrally.
The study therefore highlights the importance of privacy-preserving AI solutions as a vital part of the energy transition. Federated learning opens up new possibilities for photovoltaic forecasting, even where data protection has previously posed a significant challenge, for example for distribution network operators and local energy communities.
The project was funded by the State of Salzburg’s “Excellence in Digital Sciences and Interdisciplinary Technologies (EXDIGIT)” initiative.
Publication: Narges Mehran; Peter Dorfinger; Nicola Leschke; Frank Pallas (2025): CF-PV: Centralized vs. Federated Edge-based Prediction Models for PV Energy Production. In: 2025 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT Europe).