Luminescence and energy storage characteristics of coke-based graphite oxide
The substantial escalation in both energy consumption and the ecological crisis prompts a shift from conventional pollution-causing energy resources towards proficient modes of energy production and storage. The most polluting fossil fuel, coal, possesses highly ordered sp2 carbon clusters that can be easily tailored into graphene derivatives promising for energy-related applications. However, the impact of the crystallinity and quality of the precursor coke on the physicochemical characteristics of the extracted carbon nanostructures needs to be identified. Herein, we have prepared graphite oxide structures (GrO) from high-quality coal coke via the Improved Hummers' method, eliminating the need for toxic NaNO3. The inherent defect states owned by coke are also of high significance, as they perform the role of various photoluminescence emission centers. The sp2 domains and different surface defects promote excitation-independent and excitation-dependent luminescence, substantiating the distinct multi-emission property of GrO. The extent of functionalization during the oxidation process also significantly affected the thermal stability of the carbogenic structure. The symmetric galvanostatic charge-discharge curves and lower internal resistance indicate the superior stability and faster ion transport of the as-synthesized GrO. A specific capacitance of 193 F/g was obtained at 0.2 A/g with excellent capacitance retention over 2500 cycles. The versatile attributes of the coke-derived GrO validate its realizable optoelectronics and energy storage applications. 2020 Elsevier B.V. -
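For readers reproducing the electrochemistry, the quoted specific capacitance follows from the standard galvanostatic discharge relation C = I·Δt/(m·ΔV). A minimal sketch, where the discharge time and voltage window are hypothetical values chosen only to illustrate the reported 193 F/g at 0.2 A/g, not figures from the paper:

```python
def specific_capacitance(current_density_a_per_g, discharge_time_s, voltage_window_v):
    """Specific capacitance (F/g) from a galvanostatic discharge curve:
    C = (I/m) * dt / dV, with current already normalized per gram."""
    return current_density_a_per_g * discharge_time_s / voltage_window_v

# Hypothetical discharge: 0.2 A/g over 965 s across a 1.0 V window -> 193 F/g
c = specific_capacitance(0.2, 965.0, 1.0)
print(c)
```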
Lung cancer detection using image processing techniques
Lung cancer is one of the most hazardous diseases, leading to high death rates across the world. A cancer is an irregular growth of cells that characteristically derives from a single irregular cell and may spread to the whole lung. It is therefore necessary to find it at an early stage and take basic steps to cure it. CT scanning is one of the most sensitive methods used in the medical field for treating patients. The quality of the image is very important for the detection of lung cancer. Pre-processing of an image is a necessary step, as detecting cancer cells in an image is difficult due to the presence of noise and low image quality. To reduce these problems, lung cancer diagnosis steps such as image enhancement, image segmentation, and feature extraction can be used. The Matlab tool has been used for processing and implementation of these methods. This paper focuses on improving image quality and optimising the workflow. Implementation is done using the image processing toolbox available in Matlab. The whole idea of this research is to show improvement over the existing system and obtain more agreeable results. RJPT. All rights reserved. -
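The enhancement-then-segmentation pipeline the abstract describes can be sketched in a few lines. This is a generic illustration in Python rather than the paper's Matlab code: a 3x3 median filter to suppress noise, followed by a simple global threshold to segment candidate regions (the threshold value is an arbitrary placeholder):

```python
def median3x3(img):
    """Denoise a 2D list of pixel values with a 3x3 median filter (borders kept)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            win = sorted(img[a][b] for a in (i - 1, i, i + 1)
                                   for b in (j - 1, j, j + 1))
            out[i][j] = win[4]  # median of the 9 neighbours
    return out

def segment(img, threshold):
    """Binary segmentation: 1 where intensity reaches the threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]

noisy = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
clean = median3x3(noisy)          # the lone bright speck is removed
mask = segment([[10, 200]], 100)  # -> [[0, 1]]
```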
Lung cancer prediction with advanced graph neural networks
This research aims to enhance lung cancer prediction using advanced machine learning techniques. The major finding is that integrating graph convolutional networks (GCNs) with graph attention networks (GATs) significantly improves predictive accuracy. The problem addressed is the need for early and accurate detection of lung cancer, leveraging a dataset from Kaggle's "Lung Cancer Prediction Dataset," which includes 309 instances and 16 attributes. The proposed A-GCN with GAT model is meticulously engineered with multiple layers and hidden units, optimized through hyperparameter adjustments, early stopping mechanisms, and Adam optimization techniques. Experimental results demonstrate the model's superior performance, achieving an accuracy of 0.9454, precision of 0.9213, recall of 0.9743, and an F1 score of 0.9482. These findings highlight the model's efficacy in capturing intricate patterns within patient data, facilitating early interventions and personalized treatment plans. This research underscores the potential of graph-based methodologies in medical research, particularly for lung cancer prediction, ultimately aiming to improve patient outcomes and survival rates through proactive healthcare interventions. 2025 Institute of Advanced Engineering and Science. All rights reserved. -
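As a sanity check on the reported metrics, the F1 score is the harmonic mean of precision and recall. Plugging in the stated precision (0.9213) and recall (0.9743) gives roughly 0.947, close to the reported 0.9482 (the small gap may reflect averaging across folds or classes):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.9213, 0.9743)  # roughly 0.947
```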
Lung tuberculosis detection using x-ray images
This research work is based on various experiments performed for the detection of lung tuberculosis using methods such as filtering, segmentation, feature extraction, and classification. The results obtained from these experiments are discussed in this paper. Lung tuberculosis is a bacterial infection that causes more deaths in the world than any other infectious disease; two billion people are infected with tuberculosis worldwide. It is caused by the bacterium Mycobacterium tuberculosis, also known as Tubercle bacillus. This research work strives to identify methods by which patients who require a second opinion on an already identified result can save a lot of money. Once we receive an X-ray image as input, pre-processing methods such as the Gaussian filter and median filter are applied. These filters help remove unwanted noise and aid in obtaining fine textural features. The output is then passed to watershed segmentation and gray-level segmentation, which help focus on the lung area. The outputs from these segmentation methods are fused to get a Region of Interest (ROI). From the ROI, statistical features such as area, major axis, minor axis, eccentricity, mean, kurtosis, skewness, and entropy are extracted. Finally, we use KNN, Sequential Minimal Optimization (SMO), and simple linear regression classification methods to detect lung tuberculosis. The results obtained in this paper suggest that the KNN classifier performs better than the other two classifiers. Research India Publications. -
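Several of the statistical ROI features named above (mean, skewness, kurtosis, entropy) have standard definitions and can be computed directly from the pixel values. A self-contained sketch, not taken from the paper, using population moments and Shannon entropy over the intensity histogram:

```python
import math

def stats_features(pixels):
    """Mean, skewness, kurtosis, and Shannon entropy of a flat list of pixels."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(var)
    skew = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurt = sum((p - mean) ** 4 for p in pixels) / (n * var ** 2) if var else 0.0
    # Shannon entropy over the normalized intensity histogram
    hist = {}
    for p in pixels:
        hist[p] = hist.get(p, 0) + 1
    ent = -sum((c / n) * math.log2(c / n) for c in hist.values())
    return {"mean": mean, "skewness": skew, "kurtosis": kurt, "entropy": ent}

roi = [1, 1, 2, 2]
feats = stats_features(roi)  # mean 1.5, skewness 0, entropy 1.0 bit
```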
Lyrics of longing: Exploring the role of music in the lived experience of homesickness among college students
The study investigates the multifaceted role of music during homesickness among first-year college students in India. As compared to other mental health outcomes, homesickness is a relatively understudied phenomenon, yet noteworthy due to its direct association with depression and anxiety. Although empirical evidence about music highlights its therapeutic potential for managing stress and anxiety, few studies have explored its role in connection with homesickness. The data for this study were collected through semi-structured interviews with 10 students about their perception of using music during homesickness. Through interpretative phenomenological analysis, the emerging themes pointed to a mixed influence, highlighting the bittersweet nature of music during homesickness. While music validates feelings and boosts confidence and motivation, it also triggers restorative nostalgia and serves as an escape from confronting homesickness. Moreover, native songs fostered an appreciation for one's culture and helped students connect with their roots. The study contributes to understanding how music is a versatile tool for students dealing with homesickness, offering solace and potential challenges. It serves as a guide to future intervention studies that could explore music's long-term influences. Recognising the diverse ways students perceive and respond to music provides valuable insights for developing tailored interventions and support systems. The Author(s) 2024. -
m-quasi-∗-Einstein contact metric manifolds
The goal of this article is to introduce and study the characteristics of m-quasi-∗-Einstein metrics on contact Riemannian manifolds. First, we prove that if a Sasakian manifold admits a gradient m-quasi-∗-Einstein metric, then M is ∗-Einstein and f is constant. Next, we show that in a Sasakian manifold, if g represents an m-quasi-∗-Einstein metric with a conformal vector field V, then V is Killing and M is ∗-Einstein. Finally, we prove that if a non-Sasakian (κ, μ)-contact manifold admits a gradient m-quasi-∗-Einstein metric, then it is an N(κ)-contact metric manifold or ∗-Einstein. Kumara H.A., Venkatesha V., Naik D.M., 2022. -
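For context, the classical gradient m-quasi-Einstein condition on a Riemannian manifold (M, g) with potential function f is the following; in the ∗-version studied in this article, the Ricci tensor is presumably replaced by the ∗-Ricci tensor. This is a sketch of the standard definition, not the paper's exact formulation:

```latex
\operatorname{Ric} + \nabla^{2} f \;-\; \frac{1}{m}\, df \otimes df \;=\; \lambda\, g,
\qquad \lambda \in \mathbb{R},\; 0 < m \le \infty .
```

When m tends to infinity the middle term drops out and the equation reduces to the gradient Ricci soliton equation; when f is constant it reduces to the Einstein condition, which is why constancy of f in the Sasakian case above forces M to be ∗-Einstein.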
Machine intelligence security: A methodological blend of fuzzy logic in industry 4.0 algorithms
The way things are made has changed a lot because of Industry 4.0, which has also ushered in a time of great technology and connectivity. The paper discusses ways to improve security in Machine Intelligence in the setting of Industry 4.0. The study uses a mix of methods to combine Fuzzy Logic with cutting-edge Industry 4.0 algorithms in order to deal with new hacking problems. Because fuzzy logic can deal with doubt and imprecision, it can be used to make current methods more reliable. This creates a complex and flexible security structure. The merger was carefully planned to make the methods for finding anomalies, reducing threats, and responding to incidents work better. The suggested method aims to make machine intelligence systems more resistant to complex cyber dangers by combining the best parts of Fuzzy Logic with Industry 4.0 algorithms. This study adds to the growing conversation about how to keep smart factory settings safe by focusing on a proactive and dynamic security model. The effects of this mix of methods could be felt in many different industries, making it possible to use advanced technologies in a safer and more reliable way in the age of Industry 4.0. 2024, Taru Publications. All rights reserved. -
Machine Learning and Artificial Intelligence Techniques for Detecting Driver Drowsiness
The number of automobiles on the road grows in lockstep with the advancement of vehicle manufacturing. Road accidents appear to be on the rise, owing to this growing proliferation of vehicles. Accidents frequently occur in our daily lives and are among the top ten causes of mortality from injuries globally. It is now an important component of the worldwide public health burden. Every year, an estimated 1.2 million people are killed in car accidents. Driver drowsiness and weariness are major contributors to traffic accidents. This study relies on computer software and photographs, as well as a Convolutional Neural Network (CNN), to assess whether a motorist is tired. The Driver Drowsiness System is built on the Multi-Layer Feed-Forward Network concept. The CNN was created using around 7,000 photos of eyes in both drowsiness and non-drowsiness phases with various face layouts. These photos were divided into two datasets: training (80% of the images) and testing (20% of the images). For training purposes, the pictures in the training dataset are fed into the network. To decrease information loss as much as feasible, backpropagation techniques and optimizers are applied. We developed an algorithm to calculate ROI as well as track and evaluate motor and visual impacts. 2022 Boppuru Rudra Prathap et al., published by Sciendo. -
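The 80/20 split described above is a standard shuffled partition. A minimal stdlib sketch (the dataset here is a placeholder list standing in for the roughly 7,000 eye images; a fixed seed keeps the split reproducible):

```python
import random

def train_test_split(items, test_frac=0.2, seed=42):
    """Shuffle a copy of the items and split off the last test_frac as the test set."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

images = list(range(7000))            # placeholder for image records
train, test = train_test_split(images)  # 5600 training, 1400 testing
```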
Machine Learning and Artificial Intelligence Techniques for Detecting Driver Drowsiness
The number of automobiles on the road grows in lockstep with the advancement of vehicle manufacturing. Road accidents appear to be on the rise, owing to this growing proliferation of vehicles. Accidents frequently occur in our daily lives and are among the top ten causes of mortality from injuries globally. It is now an important component of the worldwide public health burden. Every year, an estimated 1.2 million people are killed in car accidents. Driver drowsiness and weariness are major contributors to traffic accidents. This study relies on computer software and photographs, as well as a Convolutional Neural Network (CNN), to assess whether a motorist is tired. The Driver Drowsiness System is built on the Multi-Layer Feed-Forward Network concept. The CNN was created using around 7,000 photos of eyes in both drowsiness and non-drowsiness phases with various face layouts. These photos were divided into two datasets: training (80% of the images) and testing (20% of the images). For training purposes, the pictures in the training dataset are fed into the network. To decrease information loss as much as feasible, backpropagation techniques and optimizers are applied. We developed an algorithm to calculate ROI as well as track and evaluate motor and visual impacts. 2022, Industrial Research Institute for Automation and Measurements. All rights reserved. -
Machine Learning and Deep Learning Analysis of Vehicle Carbon Footprint
Climate change is clearly one of the most significant hazards to mankind today, and the situation worsens daily. Climate change is characterised by shifts in the patterns of temperature and weather. Human activity generates the primary greenhouse gas emissions; among these activities are the burning of coal, oil, natural gas, and other fuels, as well as agricultural techniques, industrial operations, and deforestation. Mostly as a result of human activities, the average temperature of the planet has increased by almost 1.1 degrees Celsius since the late 1800s. By one estimate, internal combustion engines account for roughly thirteen percent. The objective of this work is to analyse a complex dataset of fuel consumption in urban, highway, and mixed environments, since the relevance of these variables dictates their place in modelling attempts. Reduced fuel use leads to reduced CO2 emissions and environmental impact. The project used numerous machine learning and deep learning approaches, and this work investigates the dataset to acquire knowledge while concurrently solving problems such as overfitting and outliers. Complexity is controlled using several methods, including VIF, PCA, and cross-validation. Models combining CNN and RNN performed very well, with an accuracy of 0.99. R-squared metrics are utilized for the evaluation of the model. Apart from linear regression, support vector machines, and Elastic Net, random forest was applied with a rewardable accuracy of 0.98. We can therefore state that our model analyzed the data properly and generated accurate output, since the results obtained during the assessment phase were exactly the same as those obtained during the training stage. Further data cleansing and study are required to increase machine learning model accuracy and performance. 
2024 The authors. -
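The R-squared metric used above for model evaluation has a simple closed form: one minus the ratio of residual to total sum of squares. A minimal sketch with toy numbers (not the paper's data):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

perfect = r_squared([1, 2, 3], [1, 2, 3])  # 1.0: predictions match exactly
partial = r_squared([1, 2, 3], [1, 2, 4])  # 0.5: one unit of residual error
```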
Machine Learning Based Optimal Feature Selection for Pediatric Ultrasound Kidney Images Using Binary Coati Optimization
Chronic kidney disease (CKD) is one of the most dangerous illnesses. Early detection is vital for improving survival rates, underscoring the need for an intelligent classifier to differentiate between normal and abnormal kidney ultrasound images. Features extracted from an image have a significant impact on classification accuracy. In this study, we present a Binary Coati optimization algorithm (BCOA) for feature selection in CKD, which focuses on reducing the high-dimensional features extracted from ultrasound images, including GLCM, GLRLM, GLSZM, GLDM, NGTDM, and first order, by employing BCOA S-shaped and BCOA V-shaped transfer functions that convert BCOA from a continuous search space to a binary form, which helps in the selection of optimal features to improve classification performance while reducing the feature dimensionality. The reduced feature set was evaluated using six machine-learning classifiers: Random Forest, Support Vector Machine, Decision Tree, K-nearest Neighbor, XGBoost, and Naïve Bayes. The efficiency of the proposed framework was assessed based on accuracy, precision, recall, specificity, F1 score, and AUC curve. BCOA-V achieved accuracy, precision, recall, specificity, F1 score, and AUC of 99%, 100%, 97%, 100%, 98%, and 98%, respectively. This makes it a superior choice for CKD diagnosis and a valuable tool for feature selection in medical diagnosis. (2024), (Intelligent Network and Systems Society). All rights reserved. -
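S-shaped and V-shaped transfer functions are the standard way to binarize a continuous metaheuristic such as BCOA: an S-shaped function gives the probability that a bit becomes 1, while a V-shaped function gives the probability that the current bit flips. A sketch of commonly used forms (the sigmoid and |tanh| choices below are typical in the binary-optimization literature, not necessarily the paper's exact functions):

```python
import math

def s_shaped(x):
    """S-shaped transfer: probability the feature bit is set to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def v_shaped(x):
    """V-shaped transfer: probability the feature bit flips."""
    return abs(math.tanh(x))

def binarize_s(x, r):
    """r is a uniform random draw in [0, 1)."""
    return 1 if r < s_shaped(x) else 0

def binarize_v(x, r, current_bit):
    return 1 - current_bit if r < v_shaped(x) else current_bit
```

The V-shaped rule preserves the current bit near x = 0, which tends to encourage exploitation late in the search; that behavioural difference is one plausible reason BCOA-V outperformed BCOA-S here.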
Machine Learning Classifiers for Credit Risk Analysis
The modern world is a place of global commerce. Since globalization became popular, entrepreneurs from small and medium enterprises to large ones have looked to banks, which have existed in various forms since antiquity, as their pillars of support. The risk of granting loans in various forms has significantly increased as a consequence, and businesses face financing difficulties. Credit Risk Analysis is a major aspect of approving a loan application and is done by analyzing different types of data. The goal is to minimize the risk of approving loans for individuals or businesses who might not pay back on time. This research paper addresses this challenge by applying various machine learning classifiers to the German credit risk dataset, evaluating and comparing the accuracy of these models to identify the most effective classifier for credit risk analysis. Furthermore, it proposes a contributory approach that combines the strengths of multiple classifiers to enhance the decision-making process for loan approvals. By leveraging ensemble learning techniques, such as the Voting Ensemble model, the aim is to improve the accuracy and reliability of credit risk analysis. Additionally, it explores tailored feature engineering techniques that focus on selecting and engineering informative features specific to credit risk analysis. 2024 Sudiksha et al., licensed to EAI. -
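Hard-voting ensembles of the kind mentioned reduce to a per-sample majority vote over the base classifiers' predictions. A minimal stdlib sketch (a generic illustration of hard voting, not the paper's specific ensemble):

```python
from collections import Counter

def majority_vote(predictions):
    """Hard voting: predictions is a list of per-classifier label lists,
    all the same length; returns the most common label per sample."""
    n = len(predictions[0])
    return [Counter(clf[i] for clf in predictions).most_common(1)[0][0]
            for i in range(n)]

# Three classifiers, three loan applications (1 = approve, 0 = reject)
votes = [[1, 0, 1],
         [1, 1, 0],
         [0, 1, 1]]
final = majority_vote(votes)  # each sample decided by 2-of-3 agreement
```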
Machine Learning in Financial Distress: A Scoping Review
Predicting financial distress is crucial for stakeholders, policymakers, governments, and management in decision-making processes. Researchers have developed various prediction models encompassing both traditional and machine-learning approaches. Notably, recent attention has shifted towards employing machine learning models to address the limitations of traditional methods. This study seeks to offer insights into current trends, identify gaps, and suggest future research directions using machine learning models for financial distress prediction, employing the PRISMA Extension for Scoping Reviews methodology. To achieve this, a comprehensive search was conducted across three databases (Science Direct, EBSCO, and ProQuest) spanning from 2020 to 2023, identifying 34 relevant articles for analysis. The findings underscore the prevalent use of Support Vector Machine in financial distress prediction, followed by the Random Forest Classifier and Artificial Neural Network, with little attention paid to other models. Furthermore, the study underscores the necessity for more research in developing countries, noting the predominance of studies from developed nations. While machine learning models hold promise for enhancing the accuracy and efficiency of financial distress prediction, additional research is imperative to evaluate their effectiveness and applicability across diverse contexts. This scoping review aims to furnish researchers, policymakers, and institutions with valuable insights and policy recommendations, shedding light on underexplored machine-learning techniques. 2024, Iquz Galaxy Publisher. All rights reserved. -
Machine Learning Technique to Detect Radiations in the Brain
The brain of humans and other organisms is affected in various ways by the electromagnetic field (EMF) radiation generated by mobile phones and cell phone towers. Morphological variations in the brain are caused by neurological changes due to EMF exposure. Cellular-level analysis is used to measure and detect the effect of mobile radiation, but it is very expensive and tedious, as the analysis requires the preparation of a cell suspension. In this regard, this research article proposes optimal broadcasting learning to detect changes in brain morphology due to EMF exposure. Here, Drosophila melanogaster acts as a specimen under EMF exposure. Automatic segmentation of the brain is performed on the microscopic images, from which discriminative geometrical characteristics are extracted to detect the effect of EMF exposure. The geometrical characteristics of the segmented microscopic brain image are analyzed. Analysis results reveal the occurrence of several discriminative characteristics that can be processed by machine learning techniques. The important discriminative characteristics are given to four varieties of classifiers, namely naïve Bayes, artificial neural network, support vector machine, and random forest, for the classification of exposed or non-exposed microscopic images of the D. melanogaster brain. The results are attained through various experimental evaluations, and the said classifiers perform well, achieving 96.44% using the discriminative characteristics chosen by the feature selection method. The proposed system is an optimal approach that automatically identifies the effect of EMF exposure with minimal time complexity, where the machine learning techniques produce an effective framework for image processing. 
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. -
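The abstract does not name its feature selection method, but ranking features by how well they separate the exposed and non-exposed classes is a common choice. A sketch of one such criterion, the Fisher score (between-class scatter over within-class scatter), offered as a generic illustration rather than the paper's method:

```python
def fisher_score(feature_by_class):
    """feature_by_class: {label: [values]} for ONE feature.
    Higher score = better class separation for that feature."""
    means, variances, counts = {}, {}, {}
    for label, vals in feature_by_class.items():
        n = len(vals)
        m = sum(vals) / n
        means[label], counts[label] = m, n
        variances[label] = sum((v - m) ** 2 for v in vals) / n
    total = sum(counts.values())
    grand = sum(means[l] * counts[l] for l in means) / total
    between = sum(counts[l] * (means[l] - grand) ** 2 for l in means)
    within = sum(counts[l] * variances[l] for l in means)
    return between / within if within else float("inf")

# A feature whose values separate the classes scores far higher than one that overlaps
separable = fisher_score({"exposed": [1.0, 1.1], "non_exposed": [5.0, 5.1]})
overlapping = fisher_score({"exposed": [1.0, 5.0], "non_exposed": [1.1, 5.1]})
```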
Machine Learning Technology-Based Heart Disease Detection Models
At present, a multifaceted clinical condition known as heart failure affects a great number of people in the world. In the early stages, to evaluate and diagnose heart failure, cardiac centers and hospitals rely heavily on ECG, which can be considered a routine tool. Early detection of heart disease is a critical concern in healthcare services (HCS). This paper presents a brief analysis of different machine learning technologies for heart disease detection. Firstly, naïve Bayes with a weighted approach is used for predicting heart disease. Secondly, automatic localization/detection of ischemic heart disease, based on features of the frequency domain, time domain, and information theory, is analyzed; two classifiers, support vector machine (SVM) and XGBoost, with the best performance are selected for classification in this method. Thirdly, an automatic heart failure identification method using an improved SVM based on a duality optimization scheme is also analyzed. Finally, for a clinical decision support system (CDSS), an effective heart disease prediction model (HDPM) is used, which includes density-based spatial clustering of applications with noise (DBSCAN) for outlier detection and elimination, a hybrid synthetic minority over-sampling technique-edited nearest neighbor (SMOTE-ENN) for balancing the training data distribution, and XGBoost for heart disease prediction. Machine learning can be applied in the medical industry for disease diagnosis, detection, and prediction. The major purpose of this paper is to give clinicians a tool to help them diagnose heart problems early on. As a result, it will be easier to treat patients effectively and avoid serious repercussions. This study uses XGBoost to test alternative decision tree classification algorithms in the hopes of improving the accuracy of heart disease diagnosis. 
The four types of machine learning (ML) models mentioned above are compared in terms of precision, accuracy, F1-measure, and recall as performance parameters. Copyright 2022 Umarani Nagavelli et al. -
Machine Learning with Data Science-Enabled Lung Cancer Diagnosis and Classification Using Computed Tomography Images
In recent times, the healthcare industry has been generating a significant amount of data in distinct formats, such as electronic health records (EHR), clinical trials, genetic data, payments, scientific articles, wearables, and care management databases. Data science is useful for analysis (pattern recognition, hypothesis testing, risk valuation) and prediction. The primary usage of data science in the healthcare domain is in medical imaging. At the same time, lung cancer diagnosis has become a hot research topic, as automated disease detection poses numerous benefits. Although numerous approaches have existed in the literature for lung cancer diagnosis, the design of a novel model to automatically identify lung cancer is a challenging task. In this view, this paper designs an automated machine learning (ML) with data science-enabled lung cancer diagnosis and classification (MLDS-LCDC) using computed tomography (CT) images. The presented model initially employs a Gaussian filtering (GF)-based pre-processing technique on the CT images collected from the lung cancer database. Besides, they are fed into the normalized cuts (Ncuts) technique where the nodule in the pre-processed image can be determined. Moreover, the oriented FAST and rotated BRIEF (ORB) technique is applied as a feature extractor. At last, a sunflower optimization-based wavelet neural network (SFO-WNN) model is employed for the classification of lung cancer. In order to examine the diagnostic outcome of the MLDS-LCDC model, a set of experiments was carried out and the results are investigated in terms of different aspects. The resultant values demonstrated the effectiveness of the MLDS-LCDC model over the other state-of-the-art methods with the maximum sensitivity of 97.01%, specificity of 98.64%, and accuracy of 98.11%. 2023 World Scientific Publishing Company. -
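The Gaussian filtering (GF) pre-processing step mentioned above amounts to convolving the CT image with a normalized Gaussian kernel. Building such a kernel is straightforward; a stdlib sketch (kernel size and sigma are illustrative defaults, not the paper's settings):

```python
import math

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2D Gaussian kernel for image smoothing (size must be odd)."""
    half = size // 2
    k = [[math.exp(-(i * i + j * j) / (2 * sigma * sigma))
          for j in range(-half, half + 1)] for i in range(-half, half + 1)]
    s = sum(sum(row) for row in k)          # normalize so the kernel sums to 1,
    return [[v / s for v in row] for row in k]  # preserving overall brightness

k = gaussian_kernel(3, 1.0)  # 3x3 kernel; the center weight is the largest
```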
Machine Learning-Based Classification of Autism Spectrum Disorder across Age Groups
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition that has gained significant attention in recent years due to its increasing prevalence and profound impact on individuals, families, and society as a whole. In this study, we explore the use of different machine learning classifiers for the accurate detection of ASD in children, adolescents, and adults. Furthermore, we conduct feature reduction to identify key features contributing to ASD classification within each age group using the Cuckoo Search Algorithm. Logistic Regression has the highest accuracy compared to the other two models. 2024 by the authors. -
Machine Learning-Enabled NIR Spectroscopy. Part 3: Hyperparameter by Design (HyD) Based ANN-MLP Optimization, Model Generalizability, and Model Transferability
Data variations, library changes, and poorly tuned hyperparameters can cause failures in data-driven modelling. In such scenarios, model drift, a gradual shift in model performance, can lead to inaccurate predictions. Monitoring and mitigating drift are vital to maintain model effectiveness. USFDA and ICH regulate pharmaceutical variation with scientific risk-based approaches. In this study, the hyperparameter optimization for the Artificial Neural Network Multilayer Perceptron (ANN-MLP) was investigated using open-source data. The design of experiments (DoE) approach in combination with target drift prediction and statistical process control (SPC) was employed to achieve this objective. First, pre-screening and optimization DoEs were conducted on lab-scale data, serving as internal validation data, to identify the design space and control space. The regression performance metrics were carefully monitored to ensure the right set of hyperparameters was selected, optimizing the modelling time and storage requirements. Before extending the analysis to external validation data, a drift analysis on the target variable was performed. This aimed to determine if the external data fell within the studied range or required retraining of the model. Although a drift was observed, the external data remained well within the range of the internal validation data. Subsequently, trend analysis and process monitoring for the mean absolute error of the active content were conducted. The combined use of DoE, drift analysis, and SPC enabled trend analysis, ensuring that both current and external validation data met acceptance criteria. Out-of-specification and process control limits were determined, providing valuable insights into the model's performance and overall reliability. 
This comprehensive approach allowed for robust hyperparameter optimization and effective management of model lifecycle, crucial in achieving accurate and dependable predictions in various real-world applications. Graphical Abstract: [Figure not available: see fulltext.]. 2023, The Author(s). -
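The SPC monitoring described above typically uses Shewhart-style control limits: points outside mean ± 3σ of the monitored metric (here, the mean absolute error of the active content) signal an out-of-control model. A generic sketch with placeholder values, not the study's data:

```python
def control_limits(values, k=3.0):
    """Shewhart-style lower/upper control limits: mean +/- k * sigma."""
    n = len(values)
    mean = sum(values) / n
    sigma = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return mean - k * sigma, mean + k * sigma

def out_of_control(values, lcl, ucl):
    """Flag monitored values falling outside the control limits."""
    return [v for v in values if not lcl <= v <= ucl]

# Baseline MAE observations (placeholder numbers) set the limits;
# a later spike outside them would trigger retraining review.
baseline = [0.9, 1.0, 1.1] * 3
lcl, ucl = control_limits(baseline)
flagged = out_of_control([1.0, 2.0], lcl, ucl)  # the 2.0 reading is flagged
```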
Machine learning-based approaches for enhancing human resource management using automated employee performance prediction systems
Purpose: This study focuses on enhancing the accuracy and efficiency of employee performance prediction to enhance decision making and improve organisational productivity. By introducing advanced machine learning (ML) techniques, this study aims to create a more reliable and data-driven approach to evaluating employee performance. Design/methodology/approach: In this study, nine machine learning (ML) models were used for forecasting employee performance: Random Forest, AdaBoost, CatBoost, LGB Classifier, SVM, KNN, XGBoost, Decision Tree and one Hybrid model (SVM + XGBoost). Each ML model is trained on an HR data set covering various features such as employee demographics, job-related factors and past performance records, ensuring reliable performance predictions. Feature scaling techniques, namely, min-max scaling, Standard Scaler and PCA, have been used to enhance the effectiveness of employee performance prediction. The models are trained to classify data, predicting whether an employee's performance meets expectations or needs improvement. Findings: All proposed models used in the study can correctly categorize data with an average accuracy of 94%. Notably, the Random Forest model demonstrates the highest accuracy across all three scaling techniques. The results presented have significant implications for HR procedures, providing businesses with the opportunity to make data-driven decisions, improve personnel management and foster a more effective and productive workforce. Research limitations/implications: The scope of the used data set limits the study, despite our models delivering high accuracy. Further research could extend to different data sets or more diverse organisational settings to validate the models' effectiveness across various contexts. 
Practical implications: The proposed ML models in the study provide essential tools for HR departments, enabling them to make more informed, data-driven decisions with regard to employee performance. This approach can enhance personnel management, improve workforce productivity and foster a more effective organisational environment. Social implications: Although AI models have shown promising outcomes, it is crucial to recognise the constraints and difficulties involved in their use. To ensure the fair and responsible use of AI in employee performance prediction, ethical considerations, privacy problems and any biases in the data should be properly addressed. Future work will be required to improve and broaden the capabilities of AI models in predicting employee performance. Originality/value: This study introduces an exclusive combination of ML models for accurately predicting employee performance. By employing these advanced techniques, the study offers novel insight into how organisations might transition from a conventional evaluation method to a more advanced, objective, data-backed approach. 2024, Emerald Publishing Limited. -
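Of the scaling techniques named in the study, min-max scaling is the simplest: each feature column is mapped linearly onto [0, 1]. A minimal sketch (generic, not tied to the study's HR data set):

```python
def min_max_scale(column):
    """Linearly rescale a feature column to the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant feature: no spread to rescale
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

# e.g. years-of-experience values 2, 4, 6 -> 0.0, 0.5, 1.0
scaled = min_max_scale([2, 4, 6])
```

Scaling matters most for distance- and margin-based models such as KNN and SVM; tree ensembles like Random Forest are largely scale-invariant, which is consistent with it performing best across all three techniques.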
Machine Learning-Enabled NIR Spectroscopy. Part 2: Workflow for Selecting a Subset of Samples from Publicly Accessible Data
Abstract: An increasingly large dataset of pharmaceutics disciplines is frequently challenging to comprehend. Since machine learning needs high-quality data sets, an open-source dataset can be a place to start. This work presents a systematic method to choose representative subsamples from the existing research, along with an extensive set of quality measures and a visualization strategy. The preceding article (Muthudoss et al. in AAPS PharmSciTech 23, 2022) describes a workflow for leveraging near infrared (NIR) spectroscopy to obtain reliable and robust data on pharmaceutical samples. This study describes the systematic and structured procedure for selecting subsamples from the historical data. We offer a wide range of in-depth quality measures, diagnostic tools, and visualization techniques. A real-world, well-researched NIR dataset was employed to demonstrate this approach. This open-source tablet dataset (http://www.models.life.ku.dk/Tablets) consists of different doses in milligrams, different shapes and sizes of dosage forms, slots in tablets, three different manufacturing scales (lab, pilot, production), coating differences (coated vs uncoated), etc. This sample set is appropriate in that the model was developed on one scale (in this research, the lab scale), making it possible to investigate how well the top models transfer when tested on new data such as pilot-scale or production (full) scale. A literature review indicated that the PLS regression models outperform artificial neural network-multilayer perceptron (ANN-MLP). This work demonstrates the selection of appropriate hyperparameters and their impact on ANN-MLP model performance. The hyperparameter tuning approaches and performance with available references are discussed for the data under investigation. Model extension from lab-scale to pilot-scale/production scale is demonstrated. 
Highlights: We present comprehensive quality metrics and a visualization strategy for selecting subsamples from existing studies; a comprehensive assessment and workflow are demonstrated using historical real-world near-infrared (NIR) data sets; the selection of appropriate hyperparameters and their impact on artificial neural network-multilayer perceptron (ANN-MLP) model performance is shown; the choice of hyperparameter tuning approaches and performance with available references are discussed for the data under investigation; model extension from lab-scale to pilot-scale is successfully demonstrated. Graphical Abstract: [Figure not available: see fulltext.]. 2023, The Author(s).
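One simple quality measure for a chosen subsample, in the spirit of the representativeness checks described above, is that its mean and spread stay close to those of the full data set. This sketch is a generic illustration of that idea, not one of the paper's specific metrics:

```python
def subsample_matches(full, subset, tol=0.1):
    """Check a subsample is representative: its mean and standard deviation
    stay within a relative tolerance (in units of the full data's sigma)."""
    def mean(v):
        return sum(v) / len(v)
    def std(v):
        m = mean(v)
        return (sum((x - m) ** 2 for x in v) / len(v)) ** 0.5
    scale = std(full) or 1.0
    ok_mean = abs(mean(subset) - mean(full)) <= tol * scale
    ok_std = abs(std(subset) - std(full)) <= tol * scale
    return ok_mean and ok_std

full = list(range(100))
even_spread = list(range(0, 100, 5))  # systematic sample covering the range
biased = [0, 1, 2, 3, 4]              # clustered at one end of the range
```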
