EDGE AI POD
Discover the cutting-edge world of energy-efficient machine learning, edge AI, hardware accelerators, software algorithms, and real-world use cases with this podcast feed from all things in the world's largest EDGE AI community.
These are shows like EDGE AI Talks, EDGE AI Blueprints as well as EDGE AI FOUNDATION event talks on a range of research, product and business topics.
Join us to stay informed and inspired!
When Edge AI Meets Hearing Loss, Access Gets Real
Crowded cafés, clinking plates, and echoey halls make conversations exhausting. We set out to change that by fitting real deep learning into an ear-sized device and proving it can separate speech from noise with almost no delay or battery hit. The result isn’t louder sound; it’s clearer lives and less fatigue.
We walk through the full Clara enhancement path: transforming raw mic input into log-mel features, stabilizing for gain shifts, and feeding a 40-layer temporal convolutional recurrent network that predicts a mask to preserve voice and suppress noise. Then we show how a light touch of the original signal brings back space and warmth, avoiding the hollow, underwater audio that turns people off. Along the way, we tackle painful transients—the cutlery and clatter that spike hearing aids—and explain how wide dynamic range compression keeps everything comfortable and intelligible.
The heart of the story is edge AI done right. Our SPU001 chip uses unstructured sparsity to skip zero multiplies in hardware, shrinking memory needs and power draw by orders of magnitude. That lets a pruned model with effective 10 MB scale run from just one MB of SRAM while holding algorithmic latency near eight milliseconds and total path time under ten. Metrics back it up: higher scale-invariant signal-to-distortion ratios, better hearing aid speech quality scores, and strong user reports. A rapid partnership with New Sound brought this to market in about three months, and audiologists on a noisy show floor heard the difference immediately.
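As a rough illustration of the zero-skipping idea described above, here is a minimal Python sketch, not the SPU001's actual instruction set. The 90% sparsity levels come from the talk; the function name `sparse_dot`, the vector length, and the zero patterns are assumptions made up for the example. It counts multiplies, which is only a proxy for power.

```python
def sparse_dot(weights, activations):
    """Dot product that skips any multiply with a zero operand.

    Returns (result, multiplies_performed). On a real chip the skip would
    happen in hardware; here it is just an if-statement (an assumption).
    """
    total = 0.0
    multiplies = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # the product would be zero, so skip the work
        total += w * a
        multiplies += 1
    return total, multiplies


n = 1000  # hypothetical vector length
# 90% of weights pruned to zero: keep every 10th weight.
weights = [1.0 if i % 10 == 0 else 0.0 for i in range(n)]
# 90% of activations zero, in a pattern independent of the weight pattern.
activations = [2.0 if (i // 10) % 10 == 0 else 0.0 for i in range(n)]

result, multiplies = sparse_dot(weights, activations)
dense_multiplies = n  # a dense engine would perform every multiply

# Weights that need storing if only nonzeros are kept (index overhead ignored).
stored_weights = sum(1 for w in weights if w != 0)
```

With these made-up patterns, only the positions where both operands are nonzero cost a multiply, so the two 10x reductions compound, and only a tenth of the weights need storing. Real savings depend on how the sparse indices are encoded and on how the zeros are distributed.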
If you care about hearing tech, edge computing, or just making conversations effortless again, this one is for you. Hear how small silicon and smart modeling turn “AI” from a buzzword into a daily benefit. Subscribe for more deep dives on practical edge AI, share with someone who struggles in noisy rooms, and leave a review with your toughest audio environment—we might feature it next.
Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org
Setting The Stage And Speaker Intro
SPEAKER_00: Okay, awesome, great.
The Hearing Aid Problem Landscape
SPEAKER_00: So I will be talking about squeezing intelligence into the ear, a practical deployment of AI speech enhancement in hearing aids. Before I get started, I'm just gonna introduce Femto AI a little bit. So hopefully the video works. There we go. We are really proud of this animation. So Femto AI, we were founded in 2018. We used to be called FemtoSense up until last Thursday. So if you've not heard this name before, don't worry, it's new. And so what we do is we're all about empowering intelligence in everyday devices. That means bringing this AI accelerated chip into really small edge applications. And a bit about me. So my name is Pong. I'm the director of strategy and chief of staff at Femto AI. I'm actually substituting for our senior deep learning engineer, Raphael, who could not join us at the conference this week. And you may notice, Raphael, he's very technical. You look at me, I'm not technical. So you're welcome to ask any questions afterwards. I can forward them to the engineers if I cannot answer them. Alright, so now let's get into the talk. Hearing aids. According to WHO, it turns out that 17% of people who could benefit from hearing aids don't actually end up using them. I'm looking around and I see a lot of people with glasses. Now imagine if 17% of people who can use glasses don't use glasses. A lot of people won't be able to see. With hearing aids, a lot of people aren't able to hear clearly. And we wonder why is that? The top reason for that is that people realize there are poor benefits from hearing aids. Even if you wear them, it doesn't actually help with your hearing. And a second reason that people stop using hearing aids is that they don't perform well in noisy situations. I have a little demo downstairs that can demonstrate what this means.
But essentially, what happens with hearing aids now is that there's a microphone, it takes a signal, it does maybe some filtering, and then it boosts everything. So noise is amplified, your speech is amplified, everything's amplified.
Why Classic Hearing Aids Amplify Noise
SPEAKER_00: So, how can we solve it? Well, we're in an AI conference, so let's throw some deep learning at it. Great, problem solved. So deep learning, it can separate the speech from noise, check. Deep learning, it can also preserve speech quality, check. And then one of the biggest problems with hearing aids right now is if you have a sharp impulse, let's say cutlery, for example, when you're out having dinner, it's super high frequency, super high amplitude, and it's really, really painful. So deep learning, great. You can also detect that and then solve that. And in fact, you can experience this sort of deep learning approach to noise reduction in Google Meets, Zoom, Teams right now. All right, so thank you for attending my talk. We're done, we solved the problem. Goodbye. No, we're actually at the edge AI talk. We're about connecting AI to the real world. So let's think about deployment. Okay, let's strap a GPU onto a hearing aid.
Deep Learning Fixes On Paper
SPEAKER_00: Okay, could that work? Maybe not. Because right now we see that, for power, typical noise reduction algorithms take about one to ten watts. For latency, if you're doing something like a Google Meets call, you have the latency budget. It doesn't matter if you're hearing in real time or not, 50 milliseconds is totally fine. And the size, I won't go into details, but it's big. So in a hearing aid deployment, power, we want to make sure
Edge Constraints: Power, Latency, Size
SPEAKER_00: that it actually lasts in a hearing aid form factor for a whole day. So 20 hours, assuming you sleep for at least four hours a day. And in a hearing aid size battery, that translates to about one milliwatt of power that can be allocated to the AI part. For latency, you want to make sure it's perceived as real time. So that translates to less than 10 milliseconds of latency. And then for size, to fit in a form factor, it has to be tiny. So we have a solution, we call it the Clara AI speech enhancement, and this is the flow. Once again, I want to remind you that I am not a technical person, so I will just read my notes here of what the components mean. So we start on the top left hand side with a noisy audio input coming from the microphone. And then we bring it into our chip, the SPU001, and we do a Fourier transform on this to turn from the time domain to the frequency domain. Followed by that, we do a log-mel transform, and that basically puts the signal into frequency bins, and that kind of maps to what the human hearing experience is like. Then
Clara Pipeline: From Mic To Clean Speech
SPEAKER_00: we do some gain invariance, like the delta MFE, just to make sure that, okay, it's gain invariant. Here comes the fun part. Then we feed it into a DNN. I will go into more details later. After that, the DNN predicts a mask that can be multiplied with the signal to then generate this enhanced STFT that has the speech but does not have the noise. And then we inject that back to the top signal flow. I realize I have a laser. Ta-da! And then here we can do some remixing where if you want to preserve some of the original noisy signal, just to make it sound natural, you can inject maybe like 20% of this, and then you have the rest, 80%, be this cleaned-up signal, and finally have your clean output. So earlier I mentioned the DNN part. Let's dig a little bit deeper into that. Oh yeah, before I get into that, in addition to all this wonderful stuff we just talked about, we can also add a lot of other
Inside The 40-Layer DNN
SPEAKER_00: signal processing to this as well. One of the fun ones is the wide dynamic range compression, which is really helpful in hearing aids. But okay, back to the DNN. What is in here? We have a 40-layer temporal convolutional recurrent neural network. So the temporal convolution part is just a sliding window over time, and then, yeah, the recurrent neural network, it is able to do the wonderful predictions. I don't know anything further than that, not a technical person. All right, so how are we able to fit that 40-layer network onto a chip?
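The mask-and-remix step described in the flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Femto AI's implementation: the function name `enhance_frame`, the example bins, and the example mask are made up, and the mask is assumed to be already expanded to one value per STFT bin. The 20%/80% split is the figure mentioned in the talk.

```python
def enhance_frame(noisy_bins, mask, dry_mix=0.2):
    """Apply a predicted mask to one STFT frame, then blend in the original.

    noisy_bins: complex STFT bins of one noisy frame
    mask:       floats in [0, 1], one per bin (stand-in for the DNN output)
    dry_mix:    fraction of the original noisy signal kept for naturalness
    """
    if len(noisy_bins) != len(mask):
        raise ValueError("mask and frame must have the same number of bins")
    out = []
    for x, m in zip(noisy_bins, mask):
        enhanced = m * x  # masked (cleaned-up) bin
        out.append((1.0 - dry_mix) * enhanced + dry_mix * x)
    return out


frame = [1 + 0j, 0.5 + 0.5j, 2 + 0j]  # made-up noisy bins
mask = [1.0, 0.0, 0.5]                # made-up mask: keep, suppress, halve
mixed = enhance_frame(frame, mask)
```

A bin the mask fully suppresses still keeps 20% of its original value after remixing, which is the "light touch of the original signal" that avoids a hollow sound. An inverse STFT, not shown here, would turn the mixed frames back into audio.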
Sparsity And The SPU001 Chip
SPEAKER_00: So at Femto AI, we're all about using the technique called sparsity. It was inspired by neuromorphic research that the founders did at Stanford University, but then pivoted over to purely digital design. So a bit about sparsity. I think a lot of people in the audience probably know more about it than I do, but I'm just gonna do a high-level overview. So a neural network is just a bunch of matrix operations, mostly multiplication. It turns out that in these matrices there are some values that have really low magnitudes, so they're really close to zero. And if you prune them down to zero, it doesn't really affect the accuracy of the model significantly. So let's say we go ahead and prune it down to zero. If you multiply anything by zero, it's zero. And so conventional approaches might just do the operation anyway. But what we have with our chip is a custom instruction set that, once it detects the zero in the unstructured sparsity, is able to skip the calculations. And so the result of that, let's say we do 90% pruning, then we're able to cut the memory footprint by 10x, and then the power consumption actually by 100x, because if we have dual sparsity of both the weights and the activations, it multiplies. And it looks great in theory. And it's also reality. So we have this tiny little chip called the SPU001, which I'm gonna hold right here, but I don't know if you're able to see it because it is absolutely tiny. Yeah, okay, I see you're all squinting your eyes at this. That's a tweezer. Those are fingers. Small. It is about 3.5 millimeters squared. And in it, we say it has 10 megabytes of effective
Real-World Metrics And Results
SPEAKER_00: memory. So what does effective mean here? So, okay, it has one megabyte of raw SRAM. And then let's say you do 90% pruning, then you can fit, in theory, a 10 megabyte model in the one megabyte footprint. And then in terms of efficiency running the algorithm I showed you earlier, we're talking about the microwatt scale, maybe up to like 1.2 milliwatts. So let's see, going back to the earlier part about noise reduction, okay, the experience of the hearing aid user, they care about these three main things. And then for deployment, these are the constraints that we have to work with. Going into this first part, we have a few results. So on the y-axis here we have, let's say, the signal-to-noise ratio, where the higher the number, the more signal, let's say clean speech, there is to the noise that we want to suppress. The blue line is the enhanced signal, whereas the orange line down here is the noisy signal. So what is SI-SDR? It is the scale-invariant signal-to-distortion ratio. The higher the score here, it means that it is a clearer signal. So you see that across different SNRs, once we enhance it, no surprise, it has greater signal-to-noise ratio. Not signal-to-noise ratio, a clearer signal. The next metric that we have here is what's called the HASQI metric. So it's a really common metric in hearing aid development. It stands for the hearing aid speech quality index. So the higher the
Live Form Factor And Battery Claims
SPEAKER_00: number, the greater the speech quality. So great. We're able to have clearer signal, greater speech quality. And then finally, the last part, after all, if you're wearing hearing aids, how does it actually feel? So here we have some metrics about the actual experience that users have. And no surprise here, when we enhance a signal, it's a better experience. So all of these things we're now able to do. And now moving on to the deployment part. So over here in my hands, I have a hearing aid with the chip actually in here, and then also the algorithm running on it. And running the AI algorithm for the whole time, the battery is able to last for 20 hours. The latency that's in here, for the algorithm itself, it's about, I think, eight milliseconds. And so overall, the signal from input coming in to hearing, it's less than 10 milliseconds. And then finally, well, the chip is in here, it's working, it does fit in the form factor. Alright, so checked all the boxes. And I just want to talk a little bit about this hearing aid.
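The deployment numbers quoted in the talk can be checked with some back-of-the-envelope arithmetic. The sketch below is an illustration only; the helper names are made up, and the inputs are the figures stated by the speaker (about 1 mW for the AI part, 20 hours of use, about 8 ms algorithmic latency, a total path under 10 ms).

```python
def ai_energy_budget_mwh(power_mw, hours):
    """Energy the AI part draws over a day of use, in milliwatt-hours."""
    return power_mw * hours


def latency_headroom_ms(total_budget_ms, algorithmic_ms):
    """Time left for everything other than the algorithm (I/O, buffering)."""
    return total_budget_ms - algorithmic_ms


# Figures taken from the talk; everything else here is hypothetical.
daily_energy = ai_energy_budget_mwh(power_mw=1.0, hours=20)
headroom = latency_headroom_ms(total_budget_ms=10, algorithmic_ms=8)

print(f"AI energy per day: {daily_energy} mWh")
print(f"Latency left for the rest of the path: under {headroom} ms")
```

At roughly 1 mW, the AI workload uses about 20 mWh over a 20-hour day, and an 8 ms algorithm inside a 10 ms budget leaves under 2 ms for the rest of the signal path.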
Partner Launch With New Sound
SPEAKER_00: So it's made by a company called New Sound, based in China, and they launched it early this year in March at the AAA conference. It's the American Audiology Association, if I remember correctly. And there were a lot of audiologists who came to try this out. And this is the part that I think is pretty interesting. They walked around the show floor, they tried out different AI hearing aids, and then they came to our booth and tried this. And then one of the quotes that they said was, I walked around, tried a bunch of AI hearing aids, I thought AI was a buzzword. Then they tried this, and they're like, oh, okay, now I finally hear the difference. So with that, I actually want to invite you to come to a demo table downstairs and actually hear this. Another thing I want to note about New Sound is that they got in touch with us a few months ago, and then they were like, all right, okay, we're interested in your chip, let's try it out, let's build it in. And then they were able to just build everything in in three months, and we're like, whoa, okay, that's pretty fast. Because if you're familiar with hardware development cycles, they can be really long. So huge kudos to our friends at New Sound. There's gonna be a lot more hearing aids
Demos, Next Steps, And Invitations
SPEAKER_00: with our chips built in coming soon, so there's a little sales pitch. Yeah, it'll be fun. And also other devices too. All right, and with that, we come to the end of the presentation. Thank you for being here, thank you for listening. I just wanna share a few other things. So, yeah, please come try out the demo at our demo table. If you know someone who uses hearing aids or who could benefit from a hearing aid, let me know. We can put you in touch to get one of these, or maybe any other hearing aids, and share more. If you're interested in deploying your own algorithms onto a chip, we do have a developer portal that you can go in and try. And then finally, next week, my colleague John will be giving a talk in Geneva at the AI for Good Global Summit, where he'll actually be talking about the hearing aid and how edge AI can help more people benefit from this technology. So thank you, and now open for questions.