The choices which enable us to manage data from 12 million meters
The Datanumia team is driven by a mission to secure a more sustainable future for the planet. With 12 million consumers on our platform, our focus every day is to simplify the digital experience of energy consumers.
Managing data from 12 million meters is not an easy task, for all the reasons you would imagine: data hygiene, processing time, security, analytics, and so on.
Furthermore, energy data is highly sensitive consumer data. From our first project with EDF, the largest retailer in France, we understood how critical it was to have a robust, secure platform which both retailers and consumers could trust.
As a software provider in a high-stakes environment, we had to make smart technology choices given the infrastructure changes taking place in the energy sector. For instance, we knew smart meters would become the norm (60% of our data points are now from smart meters). This meant we needed to process large amounts of data at high speed.
This need quickly steered us towards Lambda Architecture.
The idea behind Lambda Architecture is to balance data processing time and accuracy by using a combination of batch processing and real-time stream processing.
In other words, while high-accuracy batch data is being processed, lower accuracy stream data can be surfaced quickly to the end user. Here is how this plays out in practice for Datanumia. An electrical smart meter generates one data point every 5 to 30 minutes. This adds up to several billion data points each day across all our customers.
In addition to meter data, we receive a variety of other data points from energy retailers such as end user profiles, contract information, rates, and so on.
The incoming data are stored in immutable datasets which contain the historical data for all users. All these data are processed in a nightly batch in order to compute energy consumption, usage breakdowns, peer comparisons and other useful insights. These insights are then stored in a dedicated database. This nightly process focuses mainly on accuracy by processing all the data.
However, this approach presents a problem when a user wants to see energy consumption for the current day: the user may get outdated information if the batch has not been executed yet (even though we have already received this user’s meter readings).
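To make the batch step concrete, here is a minimal sketch in plain Java of a job that recomputes daily consumption per meter from the full immutable history. It is an illustration under stated assumptions, not our production code: the MeterReading record, its field names and the dailyConsumption method are hypothetical, and an in-memory list stands in for the real datasets.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BatchLayerSketch {

    /** One immutable meter reading (hypothetical shape, for illustration only). */
    record MeterReading(String meterId, LocalDateTime timestamp, double kWh) {}

    /**
     * Recomputes the daily consumption of every meter from the complete history.
     * Nothing is updated in place: the whole view is rebuilt on each run,
     * which is what makes the batch result the accurate reference.
     */
    static Map<String, Map<LocalDate, Double>> dailyConsumption(List<MeterReading> history) {
        Map<String, Map<LocalDate, Double>> view = new TreeMap<>();
        for (MeterReading reading : history) {
            view.computeIfAbsent(reading.meterId(), id -> new TreeMap<>())
                .merge(reading.timestamp().toLocalDate(), reading.kWh(), Double::sum);
        }
        return view;
    }
}
```

Because the job only reads the history and never mutates it, a run can be repeated or corrected at any time without losing raw data.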
Lambda architecture enables us to overcome this latency
When we receive data from energy suppliers, we dispatch them to another system which computes and stores exactly the same insights as the ones generated through the batch.
The results may be less accurate at times, but the calculations are fast because they are performed only on the most recently received data rather than on the full history. When the front-end requests an energy consumption value, an intermediate component executes the queries against the different insight stores, blends the results and, lastly, returns the resolved result.
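The sketch below shows, in plain Java, one way the fast path and the intermediate component could fit together. It is a simplified illustration, not our actual implementation: the class, record and method names are hypothetical, in-memory maps stand in for the real stores, and the blending rule (use the batch value when it exists, otherwise fall back to the fast value) is an assumption chosen for clarity.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.concurrent.ConcurrentHashMap;

final class ServingLayerSketch {

    /** Key of a daily insight (hypothetical shape, for illustration only). */
    record DayKey(String meterId, LocalDate day) {}

    // Batch view: written by the nightly job, accurate but possibly stale.
    private final Map<DayKey, Double> batchView = new ConcurrentHashMap<>();
    // Speed view: updated on every incoming reading, fresh but approximate.
    private final Map<DayKey, Double> speedView = new ConcurrentHashMap<>();

    /** Fast path: fold one new reading into the running total for its day. */
    void onReading(String meterId, LocalDateTime timestamp, double kWh) {
        speedView.merge(new DayKey(meterId, timestamp.toLocalDate()), kWh, Double::sum);
    }

    /** Nightly path: publish the accurate value computed by the batch. */
    void publishBatchResult(String meterId, LocalDate day, double kWh) {
        batchView.put(new DayKey(meterId, day), kWh);
    }

    /**
     * Query resolution (assumed rule): prefer the batch value when the batch
     * has already covered that day, otherwise return the speed value.
     * Eviction of superseded speed entries is left out of this sketch.
     */
    OptionalDouble dailyConsumption(String meterId, LocalDate day) {
        DayKey key = new DayKey(meterId, day);
        Double batch = batchView.get(key);
        if (batch != null) {
            return OptionalDouble.of(batch);
        }
        Double speed = speedView.get(key);
        return speed != null ? OptionalDouble.of(speed) : OptionalDouble.empty();
    }
}
```

With this rule, a user looking at the current day sees the fast value right away, and the same query returns the accurate figure once the nightly batch has run.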
This architecture enables us to process huge amounts of data while delivering useful insights at lightning speed without sacrificing too much accuracy.
The diagram below summarizes this architecture:
From a technical standpoint, our stack is made up of a mix of the following technologies: Kafka, Spark, Cassandra and plain vanilla Java.
Our architecture and tool choices give us the agility to:
• Provide analytical value to energy retailers in a diverse landscape of needs and priorities
• Evolve with the market as new technologies, such as AI, become widely adopted
With 12 million consumers on our platform, Datanumia is the customer engagement leader in Europe for the energy sector.
Come learn how we help energy retailers better engage customers around their consumption, connected devices, carbon footprint and many more use cases. For more information or a demo, please get in touch via the contact form or directly via email@example.com