EDGE AI POD
Discover the cutting-edge world of energy-efficient machine learning, edge AI, hardware accelerators, software algorithms, and real-world use cases with this podcast feed covering all things from the world's largest EDGE AI community.
It features shows like EDGE AI Talks and EDGE AI Blueprints, as well as EDGE AI FOUNDATION event talks on a range of research, product, and business topics.
Join us to stay informed and inspired!
A Unified Neuromorphic Platform for Sparse, Low Power Computation
Sensors are flooding the edge with data while CPUs juggle denoising, formatting, and inference. We built ADA to flip that script: a Turing-complete neuromorphic processor that computes with time-encoded spikes, slashing power, latency, and memory movement by keeping work inside an event-driven pipeline.
We start by unpacking why conventional embedded architectures stall under modern workloads, from pre-processing bottlenecks to compromised security on battery-powered devices. Then we break down neuromorphic fundamentals—how spikes encode information and why sparsity matters—and compare general-purpose frameworks, highlighting the trade-offs that often inflate activity or force manual design. From there, we explain why we chose interval coding and how we solved its biggest flaw. By predicting future spike times, ADA avoids per-tick updates, reducing update complexity from linear to logarithmic in the encoding precision and mapping neatly to simple add, multiply, and shift hardware.
You’ll hear how the architecture comes together: a tiny neuron core that fits in modest FPGAs, standard interfaces like UART and AER for DVS cameras, and our Axon SDK that compiles Python, NumPy, or C algorithms into deployable binaries—no neuron micromanagement required. We demo a three-tap FIR filter built from modular primitives and show ADA acting as a programmable pre-processing element for event vision. On the DVS128 gesture dataset, ADA’s spatial-temporal denoising cut downstream compute by over 50%, keeping the pipeline sparse and fast.
Security gets equal attention. We extended the primitive set with modulus arithmetic to support polynomial math central to post-quantum cryptography such as Kyber. The result: 5x better power efficiency and a 2.5x improvement in energy-latency product over MCU baselines, with clear paths to reduce latency further. It points to neuromorphic cryptography that protects implants and IoT sensors without sacrificing battery life.
Ready to try it? The Axon SDK is publicly available. Give ADA a spin, share your toughest edge workload, and subscribe for more deep dives into neuromorphic computing. If this sparked ideas, leave a review and pass it to a friend building at the edge.
Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org
Why Edge Hardware Is Failing
SPEAKER_00: Today we are very excited and very proud to present our first-ever neuromorphic platform, ADA. ADA is a Turing-complete neuromorphic processor and computational framework. Unlike traditional neuromorphic solutions, which mostly focus on AI acceleration, ADA is designed as a general-purpose neuromorphic processor. It leverages brain-inspired neuromorphic computation to deliver sparse, low-power products that combat real-world problems like latency, power, and memory. So without further ado, let's quickly go through the agenda for today. First we'll start with our motivations behind developing ADA. Then, as a refresher, we'll cover the neuromorphic concepts. Then we'll look at some general-purpose neuromorphic frameworks, deep-dive into the actual hardware of our ADA chip, and finish strong with our use cases and the future of ADA. As we all know, the world is producing enormous amounts of data today, be it video signals, audio signals, or biosignals, and all of it is flowing through edge devices. The embedded architectures of today are really struggling to keep up, so there's a growing mismatch between the demands of modern algorithms and the hardware architectures we have. Edge devices mostly face two problems: sensor-to-inference bottlenecks, for example in pre-processing and denoising, which suffer from software and hardware constraints; and data security, for example in implantable medical devices, where some people have to turn off encryption just to save power, which can leave sensitive data exposed. That is not an ideal situation.
So to combat these problems, manufacturers today have to choose between inflexible but high-performance ASIC solutions, flexible but underpowered microcontrollers, or really bulky FPGAs. Each comes with its own set of drawbacks, be it high power, low flexibility, or the need for specialized expertise. With all of this in mind, we developed ADA, and Francesco, our CSO, will introduce how ADA fits as a solution to all of these problems.
Neuromorphic Basics And Sparsity
Survey Of General Frameworks
Why STICK And How It Works
SPEAKER_01: Thank you, Abir. We at Neucom think the solution lies in moving to a novel computing paradigm to solve the existing tech limits. That means venturing into unconventional computing, the discipline that studies how processing of data is done in natural systems in order to reverse-engineer those principles and come up with new devices, algorithms, and ways of processing information that take inspiration from those systems. Among the most famous approaches are optical computing and quantum computing, but there is also neuromorphic computing, which is what we're looking at at Neucom and what we're building our tech on. Neuromorphic computing is an abstraction of how information is processed in real biological neurons. As you can see here, neurons operate with voltage traces, and neuromorphic computing simplifies these voltage traces into a Boolean abstraction called spikes, then carries out computation by propagating these spikes through neural networks. Here is a very simplified picture of how this works in a spiking neuron: inputs arrive at the neuron over time as a series of spikes, and there's a threshold comparison. If enough spikes come into the neuron, the neuron fires spikes of its own, and these form the output to the following neurons in the network. This is very attractive because it creates sparsity in the network, which saves a lot of energy by not crunching numbers but processing Boolean events over time. And we're already seeing that in existing neuromorphic devices: there's already proof that this type of technology can scale beyond digital circuit limits, even though it's still a fairly early tech.
Today's designs are still based on stable digital platforms, but we expect these power savings to be even bigger as we move to analog or mixed-signal substrates, because that will create even more sparsity and allow more sophisticated processing of this information. However, even as the tech scales, we believe there's more to gain from neuromorphic computing by moving spike-based computation beyond just the inference core, as it is now, and essentially cramming more of the edge AI pipeline into this spike-based framework. We think this could reduce reliance on the CPU, meaning less data movement and less CPU time spent processing the data, and could also increase end-to-end efficiency, with savings throughout the whole edge AI pipeline from sensor to inference core. So we started by asking ourselves what can be done now and which computations could benefit most from these substrates. To answer that question, there are a few general-purpose frameworks out there, and we're going to present a few of them. The first one we started looking into is the Neural Engineering Framework (NEF), which is a very powerful framework. It allows you to model dynamical systems out of rate-based neurons, so it's very good for simulating state-space problems, but because the neurons are rate-based, this creates a lot of spikes. The spiking activity is very dense, which is not what we want for sparsity. There is also noise in the readout, so there's some overhead when cleaning the signal out of the population activity.
The second framework, called Fugu, is very modular and incredibly flexible, so it allows you to model a lot of different algorithms, but its flexibility comes at a cost to the user. Algorithms have to be manually defined, or built by stacking components together in a way that is not really standardized; it's a rather empirical way of building algorithms through the SDK. And finally, we looked into STICK, the Spike Time Interval Computational Kernel. In this framework, as the name suggests, values are encoded in the time between consecutive spikes: the interval between two consecutive spikes is what encodes a number. This again gives you extreme flexibility; it's a Turing-complete framework. But the problem lies in the encoding strategy: the higher the resolution, the longer the time needed to faithfully encode numbers in these spike intervals. Dima is going to show how we solved this, but that is one of the drawbacks of the vanilla implementation of STICK: it can create a big latency. There's also limited real-world validation, because STICK was proposed almost ten years ago in a journal article but hasn't seen much real-world application since. So these are the frameworks we explored, with some of their drawbacks, but we liked STICK so much that we're sticking with it, no pun intended. As I mentioned, it's a time-based framework, so it converts from a continuous domain to a time-interval domain and vice versa. And the way it works is ultra-sparse: not only are just two spikes needed per encoding, but the networks built with STICK also have very low synaptic fan-in and fan-out, which means the spikes traveling through the networks are very sparse.
There are not a lot of spikes traveling simultaneously in the networks. Another great property is its Turing completeness, which comes from its modularity. There are a few computational primitives, defined as very small networks of roughly five to ten neurons each. These networks can be stacked together, composed by encapsulating the primitives into a block in which only the input and output neurons are exposed as interfaces. By combining these into more complex, modular algorithms, one can replicate any function you want. Note that all of these components simply manipulate the time interval they receive, encoded as two spikes; just by manipulating the time interval and reading the corresponding interval sent as output, we can carry out the computation. So now Dima, our CTO, is going to tell you all about our invention.
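To make the interval-coding idea concrete, here is a minimal Python sketch of how a value can be mapped to the gap between two spikes and transformed by a "multiply by constant" primitive modelled purely as an interval transform. The timing constants `T_MIN` and `T_COD` and the function names are illustrative assumptions, not ADA's actual parameters.

```python
# Minimal interval-coding sketch (constants are illustrative, not ADA's real timing).
T_MIN, T_COD = 10.0, 100.0  # minimum gap and coding window, in arbitrary time units

def encode(value, t0=0.0):
    """Encode a value in [0, 1] as the gap between two spike times."""
    return (t0, t0 + T_MIN + value * T_COD)

def decode(t0, t1):
    """Recover the value from an inter-spike interval."""
    return ((t1 - t0) - T_MIN) / T_COD

def multiply_const(spikes, c):
    """A 'multiply by constant' primitive, modelled as a pure interval transform."""
    t0, t1 = spikes
    return encode(decode(t0, t1) * c, t0)

# Composing primitives amounts to chaining interval transforms:
half = multiply_const(encode(0.8), 0.5)  # interval now encodes 0.4
```

Only the two spike times cross a block boundary, which is why composing primitives this way keeps activity so sparse.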
ADA Architecture And SDK
Solving Temporal Precision Latency
FIR Demo And Event Pipelines
DVS Denoising And Data Reduction
Cryptography Primitives On ADA
Benchmarks, Roadmap, And Open SDK
SPEAKER_02: Thank you. So here we have ADA, which stands for Asynchronous Dataflow Architecture. What we're trying to do is take this STICK interval-coding framework and apply it to an embedded architecture: a hardware platform as well as a software environment for deploying the kinds of computation we identified before, data security and the entire sensor-to-inference pipeline, allowing neuromorphic computation on an embedded hardware platform. We're taking advantage of the extensibility the STICK framework provides and implementing a hardware architecture that achieves this neuromorphic benefit for many different classes of computation. At its core, ADA is an engine for executing these interval-coded spiking neural networks. It's fully programmable in the sense that an algorithm, a spiking-neural-network representation of some class of computation using this interval-coding framework, can be loaded and dynamically reloaded onto the device. We provide standard interfaces for moving data to and from the chip using the encoder logic: depending on the application, you may want to use UART, or AER for DVS cameras, for both sparse and continuous data flow. We also provide a standard TileLink interface, which allows a host CPU to configure and load the model. As for the actual neuron core, it fits within a 16,000-LUT FPGA, which is very small in terms of area. This is the benefit we're trying to expose with this neuromorphic framework: we can genuinely reduce the amount of computation by using this interval-coding architecture.
On top of this, a big challenge we wanted to address is that we don't want people to have to develop and implement these interval-coding networks, or even be too exposed to this architecture at all. We address this with what we call the Axon SDK, a suite of tools that lets you take a computational representation in Python, NumPy, or C, run it through a compiler, and export a binary file representing the actual neural and synaptic connections within the ADA device. This binary is loaded over a runtime API on your companion microcontroller, which lets you move an algorithm all the way onto our device without having to interact with any training or manage the network whatsoever. If you do want to go deeper, we also provide a network simulator, which is open-sourced and allows extension or implementation of new modular primitives (one of which we'll introduce in a bit), and a hardware emulator, which gives you an idea of the resource usage and the latency estimates you can expect before deploying to the device. One of the benefits of the encoding and decoding architecture is the ability to scale up the amount of data flow being brought from the continuous domain into the interval-coded domain. This just gives an idea of what the encoding architecture looks like as a combinational circuit: quite simple to implement, and highly configurable for different precisions of data types moving from the continuous domain into the spiking, event-based domain.
Now, one of the issues we found with this temporal processing in the STICK spiking-neural-network representation is that higher levels of precision require a larger window of execution, which means the number of neural updates scales linearly with the precision. For our neural core, one idea we had was that when an event occurs, we can actually predict in advance the time when the neuron will fire in the future. This means we don't have to do any neural updates within that entire window, which reduces the amount of computation from scaling linearly to scaling logarithmically, a huge benefit in terms of latency and, more generally, performance and area. The neural update itself is composed only of add, multiply, and shift operations, which are very cheap to implement in hardware, so we can build complex functionality from primitive neural updates using simple hardware blocks. Now, to get into some more practical use cases, let's start with a simple one: a three-tap finite impulse response (FIR) filter. The encoded interval comes in and goes through a multiplication block, one of the primitives STICK provides, where it's multiplied by a constant. It then passes through a synchronizer together with the previous samples x[n-1] and x[n-2], so that these intervals are all aligned to the same execution window, and finally through a linear combination block, which implements the FIR function. This is only about 110 neurons, and it shows how the computational graph looks.
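The predict-ahead idea can be sketched in Python: instead of ticking the membrane state every time step, each input recomputes a predicted threshold-crossing time and pushes it onto an event queue, and stale predictions are skipped via a version counter. This is a toy event-driven neuron, not ADA's actual neuron model; the constant-slope dynamics and all names are assumptions, and for simplicity all inputs are assumed to arrive before the first output spike.

```python
import heapq

class Neuron:
    """Toy integrate-to-threshold neuron driven by a constant slope."""
    def __init__(self, threshold=1.0):
        self.threshold = threshold
        self.v = 0.0        # membrane potential
        self.slope = 0.0    # constant drive (simplified synapse)
        self.last_t = 0.0
        self.version = 0    # invalidates stale predictions

    def advance(self, t):
        # Jump the state directly to time t -- no per-tick updates needed.
        self.v += self.slope * (t - self.last_t)
        self.last_t = t

    def receive(self, t, dv=0.0, dslope=0.0):
        """Apply an input event and return the new predicted firing time (or None)."""
        self.advance(t)
        self.v += dv
        self.slope += dslope
        self.version += 1
        if self.slope > 0:
            return t + (self.threshold - self.v) / self.slope
        return None

def simulate(inputs):
    """inputs: time-sorted (t, dv, dslope) events; returns output spike times."""
    n, heap, spikes = Neuron(), [], []
    for t, dv, dslope in inputs:
        pred = n.receive(t, dv, dslope)
        if pred is not None:
            heapq.heappush(heap, (pred, n.version))
    while heap:
        t_fire, ver = heapq.heappop(heap)
        if ver != n.version:
            continue  # stale prediction, superseded by a later input
        n.advance(t_fire)
        spikes.append(t_fire)
        n.v, n.slope = 0.0, 0.0  # reset after firing
        n.version += 1
    return spikes
```

The work done is proportional to the number of events, not to the number of ticks in the execution window, which is the source of the linear-to-logarithmic saving described above.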
This is the output a compiler would generate, from a logical perspective, and it describes how a fairly complex computation is represented using these STICK primitives. Now, one use case that is particularly exciting for us is fitting ADA within an audio, video, or event-based processing stream. The idea is that ADA is a programmable processing element that can implement denoising and spatio-temporal encoding, removing the CPU from the core data flow entirely. Where a certain routine, say tensor formatting, would be done in, for example, TensorFlow or SpikingJelly in the case of spiking neural networks, we can instead treat ADA as a hardware-accelerated auto-encoding layer. Everything remains in the event domain within ADA, and we're able to hoist the CPU completely out of the execution process, significantly reducing latency, memory movement, and the amount of computation not done on accelerated hardware. As an example, we applied spatio-temporal denoising to the DVS128 gesture dataset and found that the ADA-based network applying this function reduces the downstream computation by over 50%. Looking at the data actually reaching the gesture-classification model, the amount processed at the end of the pipeline is significantly lower. So we see this as a nice fit for audio and for vision/video, and this result is specifically for the DVS event-based camera. This is something that's very interesting for us.
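As an illustration of the kind of spatio-temporal filter involved, here is a common nearest-neighbour denoiser for DVS event streams: an event survives only if a neighbouring pixel fired within a short time window. This is a generic sketch, not the network ADA actually deploys; the `dt` value and the event-tuple layout are assumptions.

```python
def denoise(events, dt=10_000, size=(128, 128)):
    """Spatio-temporal filter for DVS events given as (t, x, y, polarity) tuples.

    Keeps an event only if one of its 8 neighbouring pixels fired within
    dt time units; isolated events are treated as noise. Events must be
    sorted by timestamp.
    """
    NEVER = -10**18
    last = [[NEVER] * size[1] for _ in range(size[0])]  # last spike time per pixel
    kept = []
    for t, x, y, p in events:
        supported = any(
            t - last[x + dx][y + dy] <= dt
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx or dy) and 0 <= x + dx < size[0] and 0 <= y + dy < size[1]
        )
        if supported:
            kept.append((t, x, y, p))
        last[x][y] = t  # record the event even if filtered out
    return kept
```

Because each event only touches a 3x3 neighbourhood, a filter like this maps naturally onto a sparse, event-driven pipeline rather than a frame-based CPU routine.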
Going further, we also mentioned the data security pipeline, and this is somewhere we had to spend some time and extend the framework even further. I think it's quite novel on our end to apply cryptography to a neuromorphic framework. What that took was adding a new modulus primitive, because modular arithmetic is a central primitive operation in the cryptography domain, where most arithmetic is done modulo some value so that everything stays within a finite field or finite polynomial ring. In the case of, for example, the post-quantum cryptography algorithm Kyber, or ML-KEM, which is a key encapsulation mechanism, we see that over 50% of the operations boil down to this modular arithmetic. So we wanted to address this main computational bottleneck with our neuromorphic framework and see whether we can bring low-power neuromorphic cryptography to IoT devices, where post-quantum cryptography, and security in general, is already quite a challenge. Can we bring this computation onto the ADA device? In terms of actually implementing the network, it boils down to polynomial multiplication, and what we're showing here is how a higher-level network is constructed from simple, extended primitives. You can see the butterfly operations, which operate on elements within a polynomial so that we can perform the number theoretic transform.
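For reference, here is what the underlying ring arithmetic looks like in plain Python: schoolbook polynomial multiplication in Z_q[x]/(x^n + 1), the negacyclic ring Kyber works in (with n = 256 and q = 3329; NTT-based implementations, like the butterfly networks mentioned above, compute the same product faster). This is a software reference for the math, not the spiking-network construction itself.

```python
def polymul_mod(a, b, q):
    """Multiply polynomials a and b (coefficient lists of equal length n)
    in the ring Z_q[x]/(x^n + 1), i.e. reduce using x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                # wrap-around term picks up a sign flip: x^n = -1
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c
```

Every operation inside the loop is an add, a multiply, and a reduction modulo q, which is exactly the workload the modulus primitive is meant to absorb.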
This performed the polynomial multiplication the same way it would be done in software, and we benchmarked it against MCU baselines from NIST, the standardization body. We achieved five times better power efficiency, with the energy-latency product 2.5 times better. Latency is slightly higher because of some digital scheduling logic, which we expect to optimize in future device revisions. In general, we see this as a promising first development, deployment, and realization of the ADA product. In terms of summary and future use cases, where we want to go with this is to harden the device in a mixed-signal domain to squeeze out more power efficiency, and to explore and harden further use cases. Some of the ones we see as a good fit are audio, video, and potentially cryptography as well. And with this, we have publicly released our Axon SDK, our software environment, for anyone to try out, and we encourage anyone with use cases in mind, or with questions, to feel free to reach out.