
EDGE AI POD
Discover the cutting-edge world of energy-efficient machine learning, edge AI, hardware accelerators, software algorithms, and real-world use cases with this podcast feed covering all things from the world's largest EDGE AI community.
These are shows like EDGE AI TALKS and EDGE AI BLUEPRINTS, as well as EDGE AI FOUNDATION event talks on a range of research, product, and business topics.
Join us to stay informed and inspired!
From Cloud to Edge: NXP's AI Journey with Ali Ors
The line between cloud and edge AI is blurring, and NXP Semiconductors stands at the forefront of this transformation. In this illuminating conversation, Ali Ors, Head of AI Strategy and Technologies at NXP, unveils how the semiconductor giant is embedding dedicated AI acceleration across their entire product portfolio—from basic microcontrollers to sophisticated application processors.
With nearly a decade in the AI space, NXP has developed a comprehensive approach that doesn't position edge against cloud, but rather sees them as complementary forces. "We're not trying to make AI easy," Ors explains, "we're trying to make it easier" for developers to harness AI capabilities in resource-constrained environments. This philosophy drives their hardware development and software enablement strategy, allowing customers across automotive, industrial, and IoT sectors to deploy sophisticated AI solutions where they're needed most.
Perhaps most fascinating is NXP's advancement in bringing generative AI to the edge. Their i.MX 95 processor now supports Large Language Models and Vision Language Models with up to 8 billion parameters, shifting the primary constraint from computation to memory limitations. This capability is already finding practical applications in conversational interfaces and scene understanding across industries, where hallucination-free, fact-based responses are non-negotiable. With their pending Kinara acquisition and active participation in the Edge AI Foundation, NXP continues to expand their ecosystem and push the boundaries of what's possible at the intelligent edge. Curious about how edge AI could transform your industry? Dive into this episode to discover the silicon that's making it happen.
Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org
Speaker 1:Ali, good to see you again. I'm trying to think of the last time we saw each other in person. I don't remember; it's been kind of a whirlwind. California?
Speaker 2:Yeah, I think it was California. I think it was one of the events. Or maybe Taiwan; we were together in Taiwan, yeah.
Speaker 1:That is true. Yes, Taipei. Well, good to see you again, and cool background once again.
Speaker 2:Thank you.
Speaker 1:Nice. You know, this is all about introducing partners to the ecosystem and the diversity of partners in our stack for the Edge AI Foundation, which is pretty cool. That being said, most people know who NXP is, but if you can let us know: you're Head of AI Strategy and Technologies at NXP and have been a longtime supporter of the foundation, so I appreciate you coming this morning to talk. Maybe you can give us a little bit of... I'm not going to ask you for the origin story of NXP, because that would probably take the whole time, but in terms of the entry of NXP into the AI space, you guys got into this a number of years ago, right? This isn't a recent development for you, right?
Speaker 2:It's... I mean, my joining NXP is about 10 years now.
Speaker 2:And we came in with an AI story from the startup that I was with that got acquired. So NXP has been in the whole AI space for quite some time; let's say one of the early edge AI silicon vendors out there. In more recent history, we're fundamentally a semiconductor processor vendor, a lot of semiconductors for various markets. We play heavily in automotive, but also in industrial and IoT markets, and we provide a lot of processing solutions and SoCs and chips into those markets, as well as PMICs and other sensor and analog components.
Speaker 2:But the main part of what I focus on is the AI hardware and software enablement that we put into our processor devices, and that's a wide portfolio, starting from the traditional microcontrollers, devices that have Cortex-M cores running a real-time OS in the 100 to 150 megahertz clock range on the CPU, and then scaling into more and more capable micros and into applications processors with our i.MX lineup, which is now in its ninth generation. So we have i.MX 9 applications processors out there. And what we're very focused on (we're seeing demand in the market, and that's why we're responding to it and building towards it) is adding dedicated AI acceleration across that whole portfolio.
Speaker 2:So we have devices in the traditional MCU space: a 150 megahertz CPU, the Cortex-M33, with an accelerator beside it. Or you have a slightly larger micro with more capable DSPs, 2D GPUs, et cetera, again with a dedicated accelerator, in our i.MX RT700. And then you get into the i.MX applications processors. These are the Linux-capable, multi-core Cortex-A/Cortex-M mixed SoCs with very rich GPU, DSP, and I/O options. And again, we started three, four years back in that family with the i.MX 8M Plus to add a dedicated accelerator.
Speaker 2:And now a lot of that portfolio has some amount of dedicated AI acceleration; there is an NPU in pretty much every device coming out in that family currently. And of course that means you have to build the software so that our developers can leverage it, and that's where we're putting a lot of emphasis. I like saying that we're not trying to make AI easy, we're just trying to make it easier with that software enablement, because it's not easy trying to take it from server grade, cloud grade, down to the edge. It's about trying to make that transition smoother and add a lot more value at the edge.
Speaker 1:Yeah, and you're one of the few semiconductor companies that's really going from edge to cloud. I mean, like you mentioned, the portfolio is quite broad. So how do you see that connection, going from the cloud to the edge? I'm somewhat familiar with that, coming from Microsoft, and there are frameworks like ONNX and things like that. When you think about these deployed solutions that are using AI, what is the connection to the cloud, or how do you think the cloud could be better leveraged?
Speaker 2:I mean, from our perspective, where NXP plays is on edge inference. We don't really get into the training side of things; we think there are better devices and a lot more compute available in the cloud or on on-prem servers for doing the typical ML or AI training. We play more on the inference side, and even within that inference side, our portfolio is more suitable for true edge deployments. It's not inference on a server-grade type of device, it's inference on the devices themselves: anything that's deployed in a vehicle, anything that's deployed in a smart home, smart building, or smart factory makes more sense for our portfolio. But that's not to say that we are edge versus cloud, and sometimes people mistake that you're either on the edge or you're on the cloud. But it seems like I've lost you on the audio side here.
Speaker 1:Yeah, there you go, I'm back. Sorry about that. Boy, today's a bad internet day. Anyway.
Speaker 2:Did the recording keep going, or was that it?
Speaker 1:It sort of did, then it died. I think you were at... we were talking about cloud to edge and what's the connection.
Speaker 2:Yeah, I can take it back to that. So, the way we approach it: our portfolio plays mostly on the inference side of the AI story, not on the training side. Of course we do support a certain amount of on-device training and edge training capabilities where necessary, but a lot of training is usually done on higher compute with higher resources, you know, memory availability and compute availability.
Speaker 2:Even on the inference side of things, we're more focused on edge inference, or true edge inference, running on the devices themselves. You're running on resource-restricted, both power- and connectivity-restricted, devices, potentially in smart home, smart factory, smart building types of deployments, or even vehicle deployments. But we don't take an approach of edge versus cloud, that one is better than the other.
Speaker 2:A lot of times, edge deployments are about saving costs, or about creating higher security and privacy, meeting those higher requirements around security and privacy, or reducing costs around connectivity and deployment. And while achieving that, if you still have connectivity, that gives you better potential from an MLOps perspective: you can maintain your models better, and you can actually collect data from the edge and improve your overall model. So it's that hybrid approach; it's not edge versus cloud, it's that the edge can be better with the cloud. Moving from the cloud to the edge does have a lot of benefits for the end use cases, typically, and that's why we're trying to make sure we have a portfolio that meets the needs as more and more of these workloads move to the edge devices and the edge processors.
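To make the hybrid pattern described above concrete, here is a minimal, purely illustrative sketch: inference always runs locally on the device, and when connectivity happens to be available, low-confidence samples are queued for cloud-side retraining and the device checks for a newer model. All function names, endpoints, and thresholds are hypothetical placeholders, not NXP APIs.

```python
# Illustrative edge-plus-cloud loop (hypothetical names, not an NXP API).
import time

CONFIDENCE_THRESHOLD = 0.6  # below this, a sample is assumed worth sending back for retraining

def run_local_inference(frame: bytes) -> dict:
    """Stand-in for the on-device runtime (e.g. an NPU-accelerated interpreter)."""
    return {"label": "person", "confidence": 0.55}  # dummy result for the sketch

def connected() -> bool:
    """Placeholder connectivity check."""
    return True

def upload_sample(frame: bytes, result: dict) -> None:
    print("queueing low-confidence sample for cloud-side retraining")

def check_for_model_update() -> None:
    print("asking the cloud endpoint whether a newer model version exists")

for _ in range(3):  # a real device would loop indefinitely
    frame = b"...camera frame..."
    result = run_local_inference(frame)  # the decision is always made at the edge
    if connected():  # the cloud is optional, not required
        if result["confidence"] < CONFIDENCE_THRESHOLD:
            upload_sample(frame, result)  # edge data feeds back to improve the overall model
        check_for_model_update()  # the MLOps loop keeps deployed models maintained
    time.sleep(0.1)
```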
Speaker 1:Right, no, that makes sense. And especially for the more MPU-class things that you were talking about, are you seeing any whispers or beginnings of generative, transformer-based models, things like that?
Speaker 2:Definitely. I mean, we're already supporting those with our existing portfolio of devices. At the higher end we have a device called the i.MX 95, and there we've deployed generative AI models, LLMs and VLMs. Typically they're sub-8-billion-parameter size, because with LLMs the dynamic has shifted a bit: it's no longer a compute restriction, it's more about the memory restriction and memory access rate restrictions that we see. The rough calculation still holds that for each billion parameters you have in an LLM, you need to account for about a gigabyte of DDR memory. So if you have eight gigabytes of DDR, you're not going to run more than seven or eight billion parameters, and even then you need to be cognizant of your time to first token and your token generation rate. So we've built out enablement in terms of what we call the eIQ GenAI Flow; it's more of a methodology than a pure compiler or SDK type of tool. We're also seeing a shift in how our developers and users are using generative AI models themselves, compared to CNNs, where the workload (and this is still predominantly the market for us, by the way) is people doing vision AI models and solution creation with convolutional neural network models.
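As a back-of-the-envelope illustration of the sizing rule mentioned above: one gigabyte of DDR per billion parameters roughly corresponds to one byte per parameter, i.e. 8-bit quantized weights, plus some headroom for the KV cache, activations, and the rest of the system. The numbers below are only a sketch of that arithmetic under those assumptions; actual budgets depend on the runtime and the quantization scheme.

```python
def fits_in_ddr(params_billion: float,
                ddr_gb: float = 8.0,
                bytes_per_param: float = 1.0,  # ~1 byte/param assumes 8-bit quantized weights
                overhead_gb: float = 1.0):     # rough headroom for KV cache, activations, OS
    """Back-of-the-envelope check of the 'about 1 GB of DDR per billion parameters' rule."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params x 1 byte is roughly 1 GB
    needed_gb = weights_gb + overhead_gb
    return needed_gb <= ddr_gb, needed_gb

# On an 8 GB device, a ~7B-parameter model fits at 8-bit while a 13B model does not:
# memory, not compute, becomes the limiting factor.
for n_billion in (3, 7, 13):
    ok, need = fits_in_ddr(n_billion)
    print(f"{n_billion}B params -> ~{need:.0f} GB needed, fits in 8 GB DDR: {ok}")
```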
Speaker 2:With CNNs, they tend to maybe start with a model or use a model as a benchmark, but they typically do customize: they retrain, they do some level of customization. They'll take a model off the shelf, like a YOLOv5 or YOLOv8, and heavily modify it so that it becomes more of a custom model, or they'll take a MobileNet and create a variant of it. The barrier to that type of customization is not very high with CNNs, but with LLMs it is quite high. If you try to actually train or retrain an LLM, a multi-billion-parameter model, you need quite a bit of expertise in that domain, and it's a scarcer resource in terms of the people who know how to do it; you need a lot more compute, a lot more data, and a means to curate that data. So we're seeing that shift: with CNNs it's more about taking a model, using it as a starting point, customizing it heavily, and then deploying a custom model in the market.
Speaker 2:Versus with LLMs, it's more about reusing a state-of-the-art, known open-source model with minimal retraining, or even using methodologies like retrieval-augmented generation, RAG, to add post-training context awareness to that model.
Speaker 2:Because in the markets that we're active in, industrial, automotive, medical, you can't have your LLM hallucinate about results; when you're using it, it has to give fact-based responses. So we're seeing this being deployed starting with conversational HMIs, human-machine interfaces, in all of these markets: in automotive e-cockpits, in medical device interactions, both with the medical practitioner and maybe the patient or the service receiver, as well as in industrial settings and smart factories with operators and machinery. So you have this conversational HMI potential to more easily work with the device without needing specific keywords or specific commands, and that's one of the first applications going to market that we're seeing on our devices with this technology, so LLMs. And then we're also seeing deployment of vision language models, so multimodal or VLM-type models, getting used in scene understanding and being deployed very quickly into actual use cases.
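For readers unfamiliar with the retrieval-augmented generation approach mentioned above, here is a deliberately tiny sketch of the pattern: retrieve the most relevant snippets from a local knowledge base and assemble a prompt that instructs the on-device LLM to answer only from that context, which is how responses stay grounded in facts. The documents, the word-overlap scoring, and the prompt wording are all illustrative stand-ins; this is not NXP's eIQ GenAI Flow.

```python
# Minimal retrieval-augmented generation (RAG) sketch with made-up factory documents.
knowledge_base = [
    "Error code E42 on the pump controller means coolant pressure is too low.",
    "The guard door must be locked out before any maintenance is performed.",
    "Firmware updates are applied from the service menu under Settings > System.",
]

def relevance(question: str, doc: str) -> int:
    """Crude relevance score: count of shared lowercase words (a stand-in for embeddings)."""
    return len(set(question.lower().split()) & set(doc.lower().split()))

def build_prompt(question: str, top_k: int = 1) -> str:
    """Retrieve the top-k snippets and wrap them in a grounded prompt for the local LLM."""
    ranked = sorted(knowledge_base, key=lambda d: relevance(question, d), reverse=True)
    context = "\n- ".join(ranked[:top_k])
    return ("Answer using only the context below. If the answer is not in the context, "
            "say you don't know.\n"
            f"Context:\n- {context}\n\nQuestion: {question}\nAnswer:")

print(build_prompt("What does error code E42 mean?"))
# The assembled prompt is then handed to the on-device LLM runtime for generation.
```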
Speaker 1:Right, right. Yeah, no, it's fascinating. I mean, I think it's definitely becoming a hot topic. People are talking about, as you mentioned, HMI, and where the right fit is for generative AI on the edge.
Speaker 2:And so we're starting to see some of those use cases pop up beyond the cutting edge. It all started off with demos of image generation, et cetera, and that doesn't make sense in a lot of the markets that we're in. But when you get into, like I said, conversational HMIs, or scene understanding and situation awareness, I mean deploying smart, not surveillance, but smart monitoring systems for health and safety in a factory or industrial setting.
Speaker 1:That's getting quickly deployed now, yeah Well, providing context around a lot of sensor data and a lot of you know that can really help the human operator really understand what's going on. So that's pretty cool. So NXP is a worldwide company, I mean. So you're based in Canada, right? I'm based in Ottawa, canada.
Speaker 2:Part of a smaller office that we have up here, about 50 people, but like you said, we have a very large global presence.
Speaker 1:Yeah. So, you know, I've met with a lot of companies where AI is almost like a virtual product line across many different geographies. How do you think about it at NXP? Do you have a center of excellence for AI that works across groups, or how does that work?
Speaker 2:We have multiple formations, so we have multiple business lines. There's the business line that I'm officially part of, called Secure Connected Edge; this is where we have our i.MX products. We have a business line for advanced automotive analog and automotive systems that is more towards devices that go into the chassis of vehicles, et cetera. So we have these two main business lines, and both of them are very active in AI, so we have both product innovation and software and hardware groups that service those two businesses. And then we have a centralized CTO organization with an AI center of competence as well, which works very collaboratively with the business lines. I'm also part of that, leading our innovation board, and in that context it touches the AI competence center that we run. And we're very distributed globally; talent is distributed, and we're distributed as well in terms of both where the AI innovation happens and where the actual R&D happens.
Speaker 1:Right. And then, yeah, speaking of organization, NXP acquired Kinara recently.
Speaker 2:We're not done yet.
Speaker 1:Not done yet.
Speaker 2:Yeah, we're still waiting for regulatory approvals, but we've announced it and we're working very hard to make sure that it happens in the first half of this year. Yeah.
Speaker 1:Cool. Well, we'll have to circle back on that once the transaction closes. But yeah, I think it's a fascinating space. We have a lot of startups in the foundation too, and a lot of companies that are finding their niche in that stack from metal to cloud, and we're seeing some interesting vertical integration.
Speaker 2:Definitely.
Speaker 2:For us, the Kinara acquisition is really very complementary to what we already had in the portfolio that I described, going from microcontrollers up to applications processors. We've had good success with our customers with the integrated, native AI compute capacity that we were offering, but some customers need more; there's always somebody that needs more than what you have. And that was already, even prior to the acquisition announcement, part of our scaling strategy: having options with external dedicated NPUs, dedicated AI accelerators, to complement the product portfolio. And we came to a point where Kinara had good technology and good penetration, especially their capabilities with LLMs and generative AI use cases, so it just felt that it accelerated NXP's timeline quite well in bringing that into the fold, provided that we complete the acquisition. And then the challenge shifts very quickly from two companies with two devices connecting over PCIe or USB, et cetera, to also integrating at the software level, which is the big challenge that we're trying to meet right away.
Speaker 1:Yeah, the corporate history of NXP is around a lot of merging of companies together, right? Philips and Freescale?
Speaker 2:Yeah, I mean, NXP is a spinoff of Philips, right, and Freescale was a spinoff of Motorola, the semiconductor entities of both of those companies, and then the acquisition of Freescale by NXP kind of brought it all together, in the 2015 timeframe. So two very big, you know, fabulous semi companies coming together. And then, yeah, we've announced a few acquisitions this year, and Kinara is directly in the AI field.
Speaker 1:In that aspect, and I think, just before we wrap up, I just want to say thank you for NXP's support of the Edge AI Foundation. I know that you guys are deeply involved. People know Davis Sawyer, he's kind of famous for our live streams and involved in our blueprints working group, and Adam Fuchs is in our datasets working group, so you guys have really leaned in to help provide leadership in the community too, which I think is really important, because, as you sort of implied, this is a team sport in terms of collaboration and getting real solutions out into customers' hands. So I think it's important to have that kind of support.
Speaker 2:Exactly, yeah, definitely. We're very happy with our contributions and the collaboration that we have with the Edge AI Foundation. Also, in the last year, the shift from what used to be TinyML towards Edge AI aligns very well with our portfolio. It kind of felt like we could only talk about half our technology when we were trying to say what is tiny and what isn't, but under the Edge AI Foundation banner, with the expanded focus that the foundation has, it's very complementary to our position in the market. That's why, like you mentioned, I have colleagues that are very active in the foundation's work, with the working groups, with the content, and we're looking forward to building more and more, working together with all the partners in the foundation, and with the foundation directly as well.
Speaker 1:Cool, really appreciate it. Well, Ali, thanks for your time this morning. Hopefully you're doing well there in Ottawa; I'm here in Bellevue, Washington. I'm sure we will meet again at one of these things, the way it works. But yeah, thanks a lot for your time.
Speaker 2:I look forward to it. Thanks, Pete.
Speaker 1:Cool, take care.