Submission declined on 1 March 2024 by HitroMilanese (talk). The proposed article does not have sufficient content to require an article of its own, but it could be merged into the existing article at Machine learning. Since anyone can edit Wikipedia, you are welcome to add that information yourself. Thank you.
Embedded Machine Learning is the field of study where embedded systems and machine learning interact. Machine learning models typically consume significant resources in terms of processing power, memory, and inference speed during both the training and inference phases. In contrast, embedded systems such as microcontrollers, ECUs, wearable devices, and edge devices have limited computing resources (memory, processor speed, etc.). Enabling such large models to run (perform inference) on these devices is the main goal of this field. Various techniques, such as hardware acceleration and model optimisation, are used to achieve this goal.
ML models can be trained on larger computing systems, such as cloud or server infrastructure, but deploying (downloading/flashing) those models onto embedded devices is challenging. A further challenge is running those models (the inference phase) on embedded systems to make predictions without a significant loss in accuracy.
Hardware-based methods
Hardware acceleration techniques leverage specialized hardware components, such as digital signal processors (DSPs), graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and dedicated neural network accelerators (NNAs), to accelerate the inference process and improve the efficiency of embedded machine learning algorithms.
Software-based methods
Some model optimization (compression) techniques, described below, are used to compress or alter an ML model so that the model takes less space and makes predictions faster without a significant loss in accuracy.
Pruning
Removes less important connections and parameters from the model, resulting in reduced size and complexity.
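A minimal NumPy sketch of magnitude pruning on a toy weight matrix (the layer shape and sparsity target are illustrative assumptions, not taken from any particular model):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)  # toy dense layer

# Magnitude pruning: zero out the 80% of weights with the
# smallest absolute value, keeping only the largest 20%.
sparsity = 0.8
threshold = np.quantile(np.abs(weights), sparsity)
pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

kept = np.count_nonzero(pruned) / pruned.size
print(f"fraction of weights kept: {kept:.2f}")  # roughly 0.20
```

In practice the resulting sparse matrix is stored in a compressed format (e.g., CSR) so the zeroed entries cost no memory, and the model is usually fine-tuned afterwards to recover accuracy.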
Quantization
Reduces the precision of parameters by using a lower-bit representation (e.g., from 32 bits to 8 bits), leading to a smaller model size and faster inference. It can be applied during the training phase (quantization-aware training) or after training (post-training quantization).
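A minimal sketch of post-training affine quantization to 8 bits in NumPy (the weight values and range are toy assumptions; production toolchains compute scale and zero point per tensor or per channel from calibration data):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.uniform(-1.0, 1.0, size=1000).astype(np.float32)  # toy float32 weights

# Affine quantization to unsigned 8-bit: real_value ~ scale * (q - zero_point)
qmin, qmax = 0, 255
scale = (w.max() - w.min()) / (qmax - qmin)
zero_point = int(round(qmin - w.min() / scale))

q = np.clip(np.round(w / scale + zero_point), qmin, qmax).astype(np.uint8)

# Dequantize to check the reconstruction error introduced by quantization.
w_hat = scale * (q.astype(np.float32) - zero_point)
print("max reconstruction error:", np.abs(w - w_hat).max())
# storage drops from 4 bytes (float32) to 1 byte (uint8) per weight
```

The maximum error is bounded by the quantization step (`scale`), which is why accuracy loss is usually small when the weight range is well covered.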
Knowledge Distillation
Transfers knowledge from a large, pre-trained teacher network to a smaller student network, resulting in a smaller, efficient model with comparable accuracy.
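The core of distillation is training the student against the teacher's softened output distribution. A toy NumPy sketch of the distillation loss for one example (the logits and temperature are made-up illustrative values):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature T > 1 softens the distribution, exposing the
    # teacher's relative confidence across wrong classes.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical logits from a teacher and a student for one input.
teacher_logits = np.array([4.0, 1.0, 0.2])
student_logits = np.array([3.0, 1.5, 0.1])

T = 2.0
p_teacher = softmax(teacher_logits, T)
p_student = softmax(student_logits, T)

# Distillation loss: cross-entropy between the softened teacher
# distribution and the student's softened predictions.
distill_loss = -np.sum(p_teacher * np.log(p_student))
print(f"distillation loss: {distill_loss:.3f}")
```

In a full training loop this term is typically mixed with the ordinary cross-entropy against the true labels, and the gradient of the loss updates only the student's parameters.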
Low-Rank Factorization
Decomposes weight matrices into lower-rank factors, reducing the number of parameters without a significant loss of accuracy.
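A minimal NumPy sketch using truncated SVD on a toy weight matrix that is (by construction) approximately low-rank; the matrix sizes and retained rank are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy 256x128 weight matrix with low effective rank:
# a rank-16 product plus a little noise.
W = rng.normal(size=(256, 16)) @ rng.normal(size=(16, 128)) \
    + 0.01 * rng.normal(size=(256, 128))

U, S, Vt = np.linalg.svd(W, full_matrices=False)
r = 16                            # retained rank
A = U[:, :r] * S[:r]              # 256 x r factor (columns scaled by singular values)
B = Vt[:r, :]                     # r x 128 factor

# W @ x is replaced by A @ (B @ x): two small multiplies instead of one large one.
orig_params = W.size              # 256 * 128 = 32768
factored_params = A.size + B.size # 256*16 + 16*128 = 6144
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(orig_params, "->", factored_params, f"params, relative error {rel_err:.4f}")
```

The choice of rank `r` trades parameter count against approximation error; real networks are usually fine-tuned after factorization.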
Neural Architecture Search (NAS)
Optimizes network architectures for both accuracy and efficiency, potentially leading to smaller models with high performance.
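A heavily simplified sketch of the idea, assuming random search over a tiny search space of fully connected networks. The accuracy function here is a made-up stand-in — a real NAS system trains or estimates each candidate's accuracy, which is the expensive part:

```python
import numpy as np

rng = np.random.default_rng(3)

def param_count(depth, width, n_in=32, n_out=10):
    # Parameter count of a fully connected net:
    # input -> `depth` hidden layers of `width` units -> output (with biases).
    sizes = [n_in] + [width] * depth + [n_out]
    return sum((a + 1) * b for a, b in zip(sizes, sizes[1:]))

def mock_accuracy(depth, width):
    # HYPOTHETICAL proxy: capacity helps with diminishing returns.
    # A real NAS would evaluate candidates by training on validation data.
    return 1.0 - 1.0 / np.sqrt(depth * width)

# Random search: sample architectures and score accuracy
# penalized by model size, favoring small but capable networks.
best = None
for _ in range(200):
    depth = int(rng.integers(1, 6))       # 1..5 hidden layers
    width = int(rng.integers(8, 129))     # 8..128 units per layer
    score = mock_accuracy(depth, width) - 1e-6 * param_count(depth, width)
    if best is None or score > best[0]:
        best = (score, depth, width)

print("best architecture:", best[1], "layers of width", best[2],
      "->", param_count(best[1], best[2]), "parameters")
```

Practical NAS methods replace random search with reinforcement learning, evolutionary search, or differentiable relaxations, but the accuracy-versus-size objective is the same.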
Parameter Sharing
Multiple weights share a single stored value (weight sharing), so only a small set of distinct values plus a compact index per weight needs to be stored, reducing model size.
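A minimal NumPy sketch of weight sharing via a small codebook. The codebook here is built from quantiles for simplicity; published approaches typically use k-means clustering over the weights, and the layer size and codebook size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
weights = rng.normal(size=(128, 128)).astype(np.float32)  # toy dense layer

# Weight sharing: replace each weight by the nearest of k shared values
# (a codebook), then store only the codebook plus a small index per weight.
k = 16
codebook = np.quantile(weights, np.linspace(0, 1, k)).astype(np.float32)
idx = np.abs(weights[..., None] - codebook).argmin(axis=-1).astype(np.uint8)
shared = codebook[idx]  # reconstructed (approximate) weight matrix

orig_bytes = weights.size * 4                      # float32 per weight
shared_bytes = codebook.size * 4 + idx.size // 2   # 4-bit indices suffice for k=16
print(f"max error: {np.abs(weights - shared).max():.3f}, "
      f"{orig_bytes} -> {shared_bytes} bytes")
```

With 16 shared values each weight needs only a 4-bit index, cutting storage roughly eightfold at the cost of some approximation error; fine-tuning the shared values can recover much of the lost accuracy.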
References
- Ajani, Taiwo Samuel; Imoize, Agbotiname Lucky; Atayero, Aderemi A. (June 2021). "An Overview of Machine Learning within Embedded and Mobile Devices–Optimizations and Applications". Sensors. 21 (13): 4412. Bibcode:2021Senso..21.4412A. doi:10.3390/s21134412. ISSN 1424-8220. PMC 8271867. PMID 34203119.