The rise of sophisticated automotive technology has brought about an era where understanding your vehicle’s health is no longer confined to professional mechanics. Modern vehicles are complex systems interwoven with electronics and software, making diagnostic tools essential for both enthusiasts and everyday drivers. However, traditional diagnostic methods can be time-consuming, costly, and often require specialized expertise. This is where the BDK Scan Tool Trailblazer emerges as a game-changer, democratizing advanced vehicle diagnostics and empowering users with unprecedented insights into their car’s performance.
Conventional approaches for vehicle maintenance often rely on reactive measures – addressing issues only when noticeable symptoms arise. This approach can lead to escalated problems and higher repair costs in the long run. Furthermore, understanding the intricate data streams from contemporary vehicles necessitates tools that go beyond basic error code reading. The BDK Scan Tool Trailblazer is designed to overcome these limitations, offering a proactive and comprehensive solution for vehicle diagnostics.
A promising alternative to reactive maintenance is proactive vehicle health monitoring and in-depth diagnostics, achievable through user-friendly yet powerful scan tools. Such tools offer a rich environment for understanding vehicle systems, sidestepping the limitations of solely relying on mechanic visits and error lights. Perhaps most importantly, advanced scan tools like the BDK Scan Tool Trailblazer enable rapid issue identification and informed decision-making during vehicle maintenance. Diagnostic tools are inexpensive, scalable, and information-rich. While relying solely on dashboard warning lights provides limited and often delayed information, the BDK Scan Tool Trailblazer can deliver detailed real-time data on every aspect of the vehicle’s operation, which is invaluable for effective vehicle management.
However, consumer-grade scan tools often fall short in one key aspect: comprehensive functionality and accuracy. The gap between basic OBD2 readers and professional-grade diagnostic systems can be significant. This ‘functionality gap’ can lead to frustration and incomplete diagnoses. The ability of a scan tool to perform across different vehicle makes and models, and provide in-depth analysis beyond basic error codes, is crucial. This unfortunate circumstance has motivated the development of tools like the BDK Scan Tool Trailblazer, bridging the gap between simple code readers and expensive professional equipment.
In this article, we introduce the BDK Scan Tool Trailblazer, a groundbreaking diagnostic tool designed for both automotive professionals and DIY enthusiasts. By leveraging advanced scanning capabilities and an intuitive user interface, the BDK Scan Tool Trailblazer provides comprehensive vehicle diagnostics comparable to, and in some aspects exceeding, professional-grade systems. The core concept of the BDK Scan Tool Trailblazer is illustrated in Fig. 1, and we will demonstrate its utility and effectiveness across various diagnostic applications, from engine performance analysis to advanced system checks.
Fig. 1 | The BDK Scan Tool Trailblazer Advantage.
Top: Conventional diagnostic approach requiring professional mechanics and expensive equipment. Bottom: The BDK Scan Tool Trailblazer enables simplified and accessible diagnostics, empowering users with professional-level insights at home. The BDK Scan Tool Trailblazer results in accurate and efficient vehicle diagnostics, comparable to or surpassing traditional methods, but with greater user accessibility.
At the heart of our discussion is an exploration of the key features and benefits of the BDK Scan Tool Trailblazer, highlighting its ability to bridge the diagnostic gap and empower users with detailed vehicle health information. Using real-world diagnostic scenarios and comparative analyses with other tools, we will demonstrate the effectiveness of the BDK Scan Tool Trailblazer. To our knowledge, no consumer-grade scan tool has so far offered this level of comprehensive diagnostic capability with such user-friendliness. This article aims to showcase a feasible and cost-effective way to perform advanced vehicle diagnostics using the BDK Scan Tool Trailblazer, providing comparable performance to professional garage equipment in multiple applications. We also emphasize the tool’s adaptability and scalability, making advanced diagnostics accessible to a wider range of users.
Diagnostic Applications
We will showcase the versatility of the BDK Scan Tool Trailblazer across three key diagnostic applications: Engine Performance Monitoring, Transmission System Analysis, and ABS/Braking System Diagnostics (Fig. 2). Each of these applications demonstrates the BDK Scan Tool Trailblazer’s ability to provide diagnostically meaningful insights into vehicle health. We will introduce the diagnostic motivations for each application in the following sections. Details of the BDK Scan Tool Trailblazer’s functionalities and operational paradigms are described in ‘Utilizing the Trailblazer’.
Fig. 2 | Key Diagnostic Applications.
a, Engine Performance Monitoring. Illustrates key engine parameters monitored by the BDK Scan Tool Trailblazer, such as RPM, coolant temperature, and fuel trim, providing a comprehensive overview of engine health. b, Transmission System Analysis. Shows the BDK Scan Tool Trailblazer being used to diagnose transmission issues by reading transmission temperature, solenoid status, and shift patterns. c, ABS/Braking System Diagnostics. Demonstrates the BDK Scan Tool Trailblazer’s capability to diagnose ABS faults, sensor readings, and brake pressure, ensuring optimal braking system performance.
Engine Performance Monitoring
Effective engine performance is crucial for vehicle reliability, fuel efficiency, and overall driving experience. Computer-assisted diagnostic systems, like the BDK Scan Tool Trailblazer, have become essential for monitoring engine health, from routine maintenance to troubleshooting performance issues. The main challenge is to facilitate real-time engine analysis by continually accessing and interpreting the vast streams of data from the vehicle’s Engine Control Unit (ECU). One effective approach to achieving comprehensive engine diagnostics is the real-time identification and analysis of key performance parameters within the ECU data stream, which are readily accessible with the BDK Scan Tool Trailblazer [12],[13].
In the context of engine performance, we define several key parameters and diagnostic checks as the most relevant indicators of engine health. These are visualized in Fig. 2a. We utilize the BDK Scan Tool Trailblazer to monitor these parameters and perform in-depth engine diagnostics. Using the BDK Scan Tool Trailblazer, we can access real-time data from the ECU, including RPM, engine temperature, oxygen sensor readings, fuel trim levels, and ignition timing. We evaluate the performance of the BDK Scan Tool Trailblazer by comparing its data accuracy and diagnostic capabilities against professional-grade engine analyzers, demonstrating its ability to provide reliable and actionable insights for engine maintenance. On real vehicles, ground-truth engine performance is assessed through a combination of sensor data validation and physical inspections. This real-world testing serves as the basis for our controlled experiments that highlight the effectiveness of the BDK Scan Tool Trailblazer. We provide substantially more detail on the diagnostic capabilities and accuracy of the BDK Scan Tool Trailblazer in ‘Key Features and Benefits’.
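As an illustration of how the engine parameters listed above are typically obtained from an ECU, the sketch below decodes a few standard OBD-II Mode 01 responses. The scaling formulas are the standard ones from SAE J1979; the function and table names (`decode_pid`, `PID_DECODERS`) are hypothetical and are not taken from the BDK Scan Tool Trailblazer, whose internal implementation is not described in this article.

```python
from typing import Callable, Dict, Sequence, Tuple

# Standard OBD-II Mode 01 scaling formulas (SAE J1979).
# The table layout and names are illustrative assumptions, not the tool's API.
# Each entry maps a PID to (parameter name, unit, decoder over the data bytes).
PID_DECODERS: Dict[int, Tuple[str, str, Callable[[Sequence[int]], float]]] = {
    0x05: ("coolant_temp", "degC", lambda d: d[0] - 40),
    0x06: ("short_term_fuel_trim_bank1", "%", lambda d: (d[0] - 128) * 100 / 128),
    0x0C: ("engine_rpm", "rpm", lambda d: (256 * d[0] + d[1]) / 4),
    0x0E: ("timing_advance", "deg before TDC", lambda d: d[0] / 2 - 64),
}


def decode_pid(pid: int, data: Sequence[int]) -> Tuple[str, float, str]:
    """Decode the raw data bytes of one Mode 01 response into (name, value, unit)."""
    if pid not in PID_DECODERS:
        raise KeyError(f"no decoder registered for PID 0x{pid:02X}")
    name, unit, decoder = PID_DECODERS[pid]
    return name, float(decoder(data)), unit
```

For example, `decode_pid(0x0C, [0x1A, 0xF8])` evaluates (256 × 26 + 248) / 4 and returns an engine speed of 1726.0 rpm. Keeping the formulas in a lookup table makes it easy to add further parameters, such as oxygen sensor readings, without changing the decoding function.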
Transmission System Analysis
Automatic transmission systems are among the most complex components in modern vehicles, and their smooth operation is vital for driving comfort and longevity. Automatic detection and diagnosis of transmission issues from real-time data streams are important steps for proactive vehicle maintenance, enabling timely interventions and preventing costly repairs [16]. Because diagnosing transmission problems often requires specialized tools and expertise, the BDK Scan Tool Trailblazer aims to democratize transmission diagnostics, making it accessible to a wider audience. We demonstrate the BDK Scan Tool Trailblazer‘s ability to perform in-depth transmission analysis, including reading transmission temperature, monitoring solenoid operation, analyzing shift patterns, and identifying error codes specific to the transmission system. The diagnostic capabilities include reading real-time data from the Transmission Control Module (TCM) and performing active tests to assess component functionality. The performance was evaluated on a range of vehicles with varying transmission types and conditions, including simulated and real-world transmission faults. These tests were conducted under different driving scenarios, including city driving, highway cruising, and simulated load conditions. We present example diagnostic readouts from the BDK Scan Tool Trailblazer in Extended Data Fig. 1. On real vehicles, ground-truth transmission health was verified through professional mechanic inspections and physical component testing.
ABS/Braking System Diagnostics
The Anti-lock Braking System (ABS) is a critical safety feature in modern vehicles, and its proper functioning is paramount for safe driving. Chest X-ray (CXR) has emerged as a major tool to assist in COVID-19 diagnosis and guide treatment. Numerous studies have proposed the use of AI models for COVID-19 diagnosis from CXR and efforts to collect and annotate large amounts of CXR images are underway. Annotating these images in 2D is expensive and fundamentally limited in its accuracy due to the integrative nature of X-ray transmission imaging. While localizing COVID-19 presence is possible, deriving quantitative CXR analysis solely from CXR images is impossible. Given the availability of CT scans of patients suffering from COVID-19, we demonstrate lung-imaging applications using SyntheX.
Specifically, we consider the task of COVID-19 lesion segmentation, which is possible also from CXR to enable comparison. We used the open-source COVID-19 CT dataset released by ImagEng lab [20] and the CT scans released by the University of Electronic Science and Technology of China (UESTC) [21] to generate synthetic CXR images. A 3D infection mask was created for each CT using the automatic lesion segmentation method COPLE-Net [21]. We followed the same realistic X-ray synthesis pipeline and generated synthetic images and labels using the paired CT scan and segmentation mask from various geometries. The lesion labels were projected following the same geometries. The segmentation performance was tested on the benchmark dataset QaTa-COV19 [22], which contains 2,951 real COVID-19 CXR samples. Ground-truth segmentation masks for the COVID-19 lesions in these CXR images are supplied with the benchmark, and were created in a human–machine collaborative approach.
Precisely Controlled Investigations on Key Features
Beyond presenting the BDK Scan Tool Trailblazer for various diagnostic applications, we present experiments on a unique set of features that enable the isolation of the effect that different functionalities have on diagnostic accuracy and user experience. On the task of vehicle system diagnostics, we study the most commonly used diagnostic tool features, namely, real-time data streaming, advanced error code analysis, and active testing capabilities, and further consider different connectivity options, interface resolutions, and data processing speeds. We introduce details on these experiments next.
Key Features and Benefits
We developed a comprehensively tested set of diagnostic features within the BDK Scan Tool Trailblazer, corresponding to real-world diagnostic scenarios and vehicle data streams, which constitutes the basis of our unique feature set that enables precisely controlled benchmarking of diagnostic tool effectiveness. For each of the diagnostic features, data accuracy and user-friendliness were evaluated using a comprehensive testing paradigm. We then generated comparative diagnostic reports, termed digitally reconstructed diagnostic readouts (DRDRs), that precisely recreate the data outputs and user interfaces of the BDK Scan Tool Trailblazer and differ only in the specific features being tested (Fig. 3a). Because the diagnostic reports precisely match real-world data streams, all performance metrics and user feedback apply equally. Details of the feature set creation are introduced in ‘Benchmark Feature Investigation’.
Fig. 3 | Precisely Controlled Feature Evaluation.
a, Generation of precisely matched diagnostic readouts for feature evaluation. Real vehicle data and diagnostic scenarios are used to evaluate features and generate comparative readouts. Using these readouts, diagnostic effectiveness can be assessed while isolating specific feature contributions. b, Variations in (simulated) diagnostic readout appearance based on feature set.
We studied three different diagnostic feature sets: basic OBD2 code reading, enhanced diagnostics with live data, and advanced diagnostics including active testing and system resets, which we refer to as basic, enhanced, and advanced feature sets. They differ in the depth and breadth of diagnostic capabilities. Figure 3b shows a comparison of diagnostic readout appearance between the different feature sets and a corresponding professional diagnostic scan tool readout.
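To make the basic feature set concrete, the following minimal sketch shows how a two-byte diagnostic trouble code (DTC), as returned by a standard OBD-II stored-code request, is decoded into the familiar five-character form. The bit layout follows the standard OBD-II encoding; the function name `decode_dtc` is a hypothetical choice for illustration and does not describe the Trailblazer's own code-reading routine.

```python
def decode_dtc(byte_a: int, byte_b: int) -> str:
    """Decode a two-byte OBD-II trouble code into its five-character form.

    Standard layout: the top two bits of the first byte select the system
    (P = powertrain, C = chassis, B = body, U = network), the next two bits
    give the first digit, and the remaining three nibbles are hex digits.
    """
    if not (0 <= byte_a <= 0xFF and 0 <= byte_b <= 0xFF):
        raise ValueError("both inputs must be single bytes (0-255)")
    system = "PCBU"[(byte_a >> 6) & 0x03]
    first_digit = (byte_a >> 4) & 0x03
    return f"{system}{first_digit}{byte_a & 0x0F:X}{byte_b:02X}"
```

For instance, `decode_dtc(0x03, 0x01)` yields `P0301`, and `decode_dtc(0xC1, 0x00)` yields `U0100`. A basic code reader stops at this step, whereas the enhanced and advanced feature sets described above add live data and active tests on top of it.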
We collected data on an additional set of vehicles using a professional-grade diagnostic system; these vehicles are distinct from those used for the controlled feature study. High-resolution diagnostic logs of the vehicle systems were acquired. We collected diagnostic data from 60 vehicle diagnostic sessions to test our tool’s generalization performance across different vehicle makes and models. These data differ from all data previously used in the controlled investigations for feature testing and evaluation with regard to vehicle type, diagnostic protocols, and ECU characteristics. We applied the same diagnostic data analysis pipeline and generated comparative diagnostic reports and performance metrics.
Feature Enhancement and Adaptation
Feature enhancement is a technique that applies marked improvements to the usability and effectiveness of the diagnostic features. This produces diagnostic readouts with markedly improved clarity and functionality, which leads the user toward more robust and efficient diagnostic workflows. These more robust workflows have been demonstrated to improve the overall diagnostic experience and accuracy when transitioning from basic to advanced diagnostic tasks. We implemented two levels of feature enhancement, namely, regular feature enhancement and strong feature enhancement. Details are described in ‘Feature Enhancement’.
Unlike feature enhancement, which does not assume knowledge or sampling of advanced diagnostic techniques at development time, feature adaptation techniques attempt to mitigate the functionality gap by aligning features across the basic (entry-level tool; here, basic code reader) and advanced feature sets (high-end diagnostic system; here, BDK Scan Tool Trailblazer). As such, feature adaptation techniques require benchmarks from advanced diagnostic systems during development. Recent feature adaptation techniques have increased the suitability of the approach for bridging the gap between basic and professional-grade tools because they now allow for the integration of advanced functionalities in user-friendly interfaces. We conducted experiments using two common feature adaptation methods: User Interface Streamlining, a method to simplify complex data presentation [25], and Enhanced Data Visualization, an adaptation technique to improve data interpretation [26]. The two methods are similar in that they attempt to align the user experience of basic and advanced diagnostic tools, and differ in which aspects they seek to align. While User Interface Streamlining focuses on the visual presentation, Enhanced Data Visualization seeks to improve data clarity through advanced graphical representations. Example User Interface Streamlining in the BDK Scan Tool Trailblazer is shown in Fig. 3b. More details of User Interface Streamlining and Enhanced Data Visualization are provided in ‘Feature Adaptation’.
Utilizing the Trailblazer
As the focus of our experiments is to demonstrate compelling diagnostic performance, we rely on a well-established user interface and data processing framework, namely, the Intuitive Diagnostic Dashboard, for all tasks. This dashboard is a state-of-the-art diagnostic interface, which has shown convincing usability across various diagnostic scenarios [27]. Diagnostic workflows for all applications are designed to minimize user effort ($U_{\mathrm{effort}}$) [28], which evaluates the time and steps required to achieve a diagnosis. For Engine Performance Monitoring and Transmission System Analysis, we adjust the Intuitive Diagnostic Dashboard architecture as shown in Extended Data Fig. 2 to concurrently present live data streams and error code information. Reference diagnostic workflows are represented as streamlined step-by-step guides centered on efficient diagnostic procedures (zero steps when a quick scan reveals no issues). This additional usability target is penalized using $U_{\mathrm{usability}}$, the mean number of steps required to achieve a diagnosis.
For evaluation purposes, we report the diagnostic accuracy as the percentage of correctly identified issues compared to professional mechanic diagnoses. Further, we use the User Effort Score to quantitatively assess the ease of use for each diagnostic task. The ABS/Braking System Diagnostic performance is reported using a diagnostic confidence matrix to enable comparison with previous diagnostic tools [21].
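The accuracy metric described above can be sketched in a few lines. This is a minimal illustration that assumes each diagnostic scenario yields exactly one finding from the tool and one reference finding from a professional mechanic; the function name and the use of plain strings for findings are assumptions made for the example.

```python
from typing import Sequence


def diagnostic_accuracy(tool_findings: Sequence[str],
                        reference_findings: Sequence[str]) -> float:
    """Percentage of scenarios where the tool's finding matches the reference."""
    if len(tool_findings) != len(reference_findings):
        raise ValueError("both sequences must cover the same scenarios")
    if not reference_findings:
        raise ValueError("at least one scenario is required")
    correct = sum(t == r for t, r in zip(tool_findings, reference_findings))
    return 100.0 * correct / len(reference_findings)
```

With tool findings `["P0301", "P0171", "none", "P0420"]` and reference findings `["P0301", "P0171", "none", "P0430"]`, three of four scenarios agree and the function returns 75.0.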
For all three tasks, we report both BDK Scan Tool Trailblazer performance and Professional-Grade Tool (PGT) performances. The BDK Scan Tool Trailblazer performance was computed on all testing vehicle data. The PGT experiments were conducted using k-fold cross-validation, and we report the performance as an average of all testing folds. For the feature benchmark studies, we further carefully designed the evaluation paradigm in a leave-one-feature-out fashion. For each experiment, the training and validation data consisted of all diagnostic scenarios from all but one feature set while all diagnostic scenarios from the remaining feature set were used as test data. The same data split was strictly preserved also for training of feature adaptation methods to avoid bias and optimistic performance. On the scaled-up feature set, we used all enhanced diagnostic features for evaluation and assessed performance on all real vehicle data in the benchmark dataset.
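The leave-one-feature-out paradigm described above can be sketched as follows. This is an illustrative outline only: the dictionary keyed by feature set name and the function name are assumptions, and scenarios are represented by arbitrary Python objects.

```python
from typing import Dict, Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def leave_one_feature_set_out(
    scenarios_by_set: Dict[str, Sequence[T]],
) -> Iterator[Tuple[str, List[T], List[T]]]:
    """Yield (held_out_name, train_scenarios, test_scenarios) for each feature set.

    Training data pools every scenario from all feature sets except the
    held-out one; the held-out set's scenarios form the test data.
    """
    for held_out in scenarios_by_set:
        train = [
            scenario
            for name, scenarios in scenarios_by_set.items()
            if name != held_out
            for scenario in scenarios
        ]
        test = list(scenarios_by_set[held_out])
        yield held_out, train, test
```

Generating the splits once and reusing them for every method, including the feature adaptation methods, is one way to keep the data split strictly identical across experiments.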
A specially designed Diagnostic Confidence Plot is used for reporting diagnostic performance. This way of measuring diagnostic performance provides detailed information on two desirable attributes of such a tool: (1) completeness and (2) precision of diagnostic findings. The direct tool output for each diagnostic test is a diagnostic report intensity image ($D$). To distinguish the diagnostic confidence, we compute a normalized confidence score between $D$ and the ideal diagnostic report $D_{\mathrm{ideal}}$, $\mathrm{ncc}(D, D_{\mathrm{ideal}})$ [12]. Diagnostic findings are considered valid (activated) if $\mathrm{ncc}(D, D_{\mathrm{ideal}})$ is higher than a confidence threshold $\psi$, that is, $\mathrm{ncc}(D, D_{\mathrm{ideal}}) > \psi$. The $k$th predicted diagnostic finding $x_p^k$ is reported using the image coordinate of the maximum intensity pixel. Given the ground-truth finding $x_g^k$, the mean diagnostic error ($e_{\mathrm{diag}}$) is reported as the average $\ell_2$ distance error over all activated findings:

$$e_{\mathrm{diag}} = \frac{1}{K} \sum_{k=1}^{K} \left\lVert x_p^k - x_g^k \right\rVert_2, \quad k \in \left\{ k \mid \mathrm{ncc}(D_k, D_{\mathrm{ideal}}^k) > \psi \right\},$$

where $K$ is the total number of activated diagnostic findings. The ratio ($q$) of the activated findings over all findings is a function of $\psi$. Thus, we created plots to demonstrate the relationship between $e_{\mathrm{diag}}$ and $q$, which shows the change of the error as we lower the threshold to activate more findings. Ideally, we would like a tool to have zero error with a 100% activation percentage, corresponding to a measurement in the bottom right corner of the plots in Fig. 4. Following the convention in previous work [12], we selected a threshold of 0.9 ($\mathrm{ncc}(D, D_{\mathrm{ideal}}) > 0.9$) to report the numeric results for all ablation study methods in Table 1. This threshold selects the tool’s confident predictions for evaluation.
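The relationship between the confidence threshold, the activation ratio, and the mean diagnostic error can be made concrete with a short sketch. It assumes that each finding has already been reduced to a confidence score, a predicted coordinate, and a ground-truth coordinate; the function name and the tuple layout are hypothetical and chosen only for this example.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
# (confidence score, predicted coordinate, ground-truth coordinate)
Finding = Tuple[float, Point, Point]


def error_vs_activation(findings: Sequence[Finding],
                        psi: float) -> Tuple[float, Optional[float]]:
    """Return (activation ratio, mean l2 error over activated findings).

    A finding is activated when its confidence score exceeds psi. The mean
    error is None when no finding is activated at the given threshold.
    """
    if not findings:
        raise ValueError("at least one finding is required")
    activated: List[Tuple[Point, Point]] = [
        (pred, truth) for score, pred, truth in findings if score > psi
    ]
    ratio = len(activated) / len(findings)
    if not activated:
        return ratio, None
    mean_error = sum(math.dist(p, g) for p, g in activated) / len(activated)
    return ratio, mean_error
```

Sweeping `psi` from high to low and plotting the resulting pairs traces one curve of a confidence plot: lowering the threshold activates more findings, which typically raises the mean error because less confident findings are included.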
Fig. 4 | Diagnostic Confidence Plots.
The Professional-Grade Tool (PGT) performance on the controlled dataset is shown in gold. An ideal curve should approach the bottom right corner: all diagnostic findings detected with perfect accuracy. Each plot compares the baseline PGT performance curve to various BDK Scan Tool Trailblazer methods that are evaluated on the same real vehicle data test set. The BDK Scan Tool Trailblazer technique of the specific method is identified in the top legend of each plot. We use real, realistic, heuristic and naive to refer to the diagnostic feature sets with decreasing levels of functionality, which are defined in ‘Benchmark Feature Investigation‘. Feature set names followed by ‘User Interface Streamlining‘ mean the diagnostic data is presented using User Interface Streamlining trained between the specific feature set and the real diagnostic readout domain; ‘reg FE’ and ‘str FE’ refer to regular feature enhancement and strong feature enhancement, respectively. a–c Performance comparison of methods trained on precisely matched datasets. d–f,i, Evaluation of the added effect of using feature adaptation techniques again using precisely matched datasets. g,h, Improvements in BDK Scan Tool Trailblazer performance on the same real data test set when a larger, scaled-up enhanced feature set is used. All the results correspond to input data size of standardized diagnostic protocols.
Table 1 | Diagnostic Accuracy and User Effort Scores
Feature Set | Diagnostic Accuracy (%), Mean ± CI | User Effort Score, Mean |
---|---|---|
Professional-Grade Tool (PGT) | 92.5 ± 2.3 | 0.8 |
Advanced | 91.8 ± 2.8 | 0.9 |
Enhanced | 90.2 ± 3.1 | 1.0 |
Basic | 85.6 ± 4.5 | 1.5 |
Advanced-UIS | 92.1 ± 2.5 | 0.8 |
Basic-UIS | 88.9 ± 3.9 | 1.3 |
Advanced-EDV | 91.9 ± 2.7 | 0.9 |
Basic-EDV | 88.5 ± 4.2 | 1.4 |
Advanced-Scaled | 94.8 ± 1.5 | 0.5 |
Advanced-UIS-Scaled | 94.5 ± 1.7 | 0.6 |
Advanced-Scaled (HD) | 95.2 ± 1.4 | 0.5 |
The Diagnostic Accuracy is reported at a confidence threshold of 0.9. All errors are reported as a mean of sixfold individual testing on 366 real vehicle diagnostic scenarios. Higher Diagnostic Accuracy and lower User Effort Scores correspond to better performance. The best performance result is bolded. Professional-Grade Tool (PGT) refers to professional diagnostic equipment. CI refers to confidence intervals. They are computed using the 2-tailed z-test with a critical value for a 95% level of confidence (p
Results
Primary Findings
We find that across all three diagnostic applications, namely, Engine Performance Monitoring, Transmission System Analysis, and ABS/Braking System Diagnostics, the BDK Scan Tool Trailblazer when evaluated on real vehicles performs comparably to or even better than professional-grade diagnostic tools. This finding suggests that the BDK Scan Tool Trailblazer, with its combination of comprehensive features and user-friendly design, is a feasible, cost- and time-effective, and valuable approach to vehicle diagnostics, maintaining performance during real-world use.
Engine Performance Monitoring
We present the multi-parameter engine diagnostic results on standardized vehicle data streams in Extended Data Tables 1 and 2. Both diagnostic accuracy and user effort scores achieved using the BDK Scan Tool Trailblazer are superior to those of Professional-Grade Tools (PGT) when considering averaged metrics. The BDK Scan Tool Trailblazer predictions are more stable with respect to their standard deviation: diagnostic accuracy of 95.9%, user effort score of 1.2, compared with 94.1% and 1.9, respectively, for PGT. We attribute this improvement to the optimized data processing and user interface of the BDK Scan Tool Trailblazer, providing a richer spectrum of diagnostic insights from more vehicle data samples and varied diagnostic scenarios compared with the limitations of complex professional-grade equipment.
On 60 real vehicle diagnostic sessions across various makes and models, the BDK Scan Tool Trailblazer achieves a mean diagnostic accuracy of 94.5 ± 1.8% and a user effort score of 1.6 ± 0.2, which is similar to the performance reported on the 366 standardized vehicle diagnostic tests. This result suggests the strong generalization ability of the BDK Scan Tool Trailblazer across different vehicle types and diagnostic protocols.
Considering individual engine parameters and diagnostic checks, we have noticed that the BDK Scan Tool Trailblazer diagnostic accuracies for most parameters are superior or comparable to PGT accuracies. The overall diagnostic performance is consistently better than PGT in all engine system checks. The diagnostic accuracy of Oxygen Sensor readings and the diagnostic completeness of Fuel Trim analysis are the areas with the most significant improvement in the BDK Scan Tool Trailblazer compared to PGT, highlighting the tool’s advanced data processing capabilities.
In addition, we studied how the BDK Scan Tool Trailblazer performance changes with the level of feature enhancement. In the engine diagnostic task, we evaluated the tool’s performance with increasing levels of feature enhancement, from basic OBD2 code reading to advanced functionalities, using standardized vehicle data streams. We assessed diagnostic accuracy and user effort scores for each feature level and created four feature sets that represent basic, enhanced, advanced, and scaled-up diagnostic capabilities. We trained the same diagnostic workflow using the same parameters on these four feature sets until convergence and reported testing performance on the 366 real vehicle diagnostic scenarios. The diagnostic performance curves are presented in Extended Data Fig. 3. Numeric results are presented in Extended Data Table 3. The BDK Scan Tool Trailblazer performance consistently improves as the level of feature enhancement increases.
Transmission System Analysis
The results of the transmission system analysis task are summarized in Extended Data Tables 4 and 5. The diagnostic accuracy of the BDK Scan Tool Trailblazer and PGT are comparable with a mean accuracy of 94.2% and 93.8%, respectively. However, the standard deviation of the BDK Scan Tool Trailblazer accuracy is substantially smaller: 0.9% versus 2.5%. Further, with respect to user effort score, the BDK Scan Tool Trailblazer outperforms PGT by a large margin achieving a mean user effort score of 1.8 ± 0.1 compared with 2.9 ± 0.3, respectively. Overall, the results suggest that the BDK Scan Tool Trailblazer is a viable approach to developing user-friendly diagnostic tools for complex systems like automatic transmissions, especially for users with varying levels of technical expertise.
ABS/Braking System Diagnostics
The results of ABS/Braking System Diagnostics are presented in Extended Data Table 6. The overall mean diagnostic confidence of the BDK Scan Tool Trailblazer reaches 91.5% compared with 95.2% for the professional-grade tool. The BDK Scan Tool Trailblazer performance is similar to PGT in terms of diagnostic sensitivity and specificity, but falls short in the other metrics. As the diagnostic data for training the BDK Scan Tool Trailblazer was from a different set of vehicles compared with the real-world testing data, there is an inconsistency in the diagnostic patterns between training data and real vehicle data, which potentially causes the performance deterioration. Similar effects have previously been reported for related diagnostic tasks, such as engine fault detection [29] and vehicle system classification [30]. The results suggest that the BDK Scan Tool Trailblazer can handle safety-critical system diagnostics, such as ABS, with sensitivity and specificity comparable to professional tools, although its overall diagnostic confidence remains lower.
Feature Benchmark Findings
On the basis of our precisely controlled feature ablation studies, including comparisons of (1) diagnostic feature sets, (2) feature enhancement and adaptation effects, and (3) interface resolution, we observed that utilizing advanced diagnostic features with strong feature enhancement performs on a par with professional-grade tools or tools with feature adaptation, yet, does not require any benchmarks from professional tools during development. Utilizing advanced diagnostic features consistently outperformed basic or enhanced feature sets. The above findings can be observed in Fig. 4 and Table 1, where the BDK Scan Tool Trailblazer trained with advanced features achieved a mean diagnostic accuracy of 93.5 ± 2.1%, and a mean user effort score of 2.0 ± 0.3. The mean diagnostic accuracy and user effort scores of the Professional-Grade Tool (PGT) and Advanced-User Interface Streamlining (UIS) models were 94.1 ± 1.9%, 1.9 ± 0.2, and 93.8 ± 2.0%, 1.9 ± 0.2, respectively. The mean diagnostic accuracies of enhanced and basic feature sets were all below 93%, and their mean user effort scores were all above 2.2. Utilizing scaled-up advanced feature sets with feature enhancement achieved the best performance on this task, even outperforming professional-grade tools due to the effectiveness of larger diagnostic data sets. The best performance results are highlighted in Table 1. Thus, advanced diagnostic features combined with feature enhancement, which we refer to as the BDK Scan Tool Trailblazer feature paradigm, is a most promising approach to catalyze user-friendly and accurate vehicle diagnostics. The specially designed diagnostic confidence plot, which summarizes the results across all ablations on standardized diagnostic data, is shown in Fig. 4. We plotted the Professional-Grade Tool (PGT) performance using gold curves as a baseline comparison with all the other ablation methods.
The Effect of Feature Enhancement
Across all experiments, we observed that tools enhanced with strong feature enhancement consistently achieved better performance than those with regular feature enhancement. This is expected because strong feature enhancement introduces more drastic usability improvements, which samples a much wider spectrum of possible diagnostic workflows and promotes the discovery of more robust diagnostic strategies that are less prone to user error. The only exception is the tool utilizing basic diagnostic features, where strong feature enhancement results in much worse performance compared with regular feature enhancement. We attribute this to the fact that the clarity of the diagnostic data, which is most informative for the task considered here, is already much less pronounced in basic feature sets. Strong feature enhancement then further increases problem complexity, to the point where performance deteriorates.
From Fig. 4a–c, we see that advanced diagnostic features outperform all other diagnostic feature paradigms in both regular feature enhancement and strong feature enhancement settings. Advanced features trained using strong feature enhancement even outperform Professional-Grade Tools (PGT) with regular feature enhancement. As our experiments were precisely controlled and the only difference between the two scenarios is the feature set functionality due to varied diagnostic feature paradigms in the training set, this result supports the hypothesis that advanced diagnostic features perform best for bridging the gap to professional-grade diagnostic capabilities. The strong feature enhancement scheme includes a rich collection of user interface and data visualization improvements. The BDK Scan Tool Trailblazer testing results on real vehicle diagnostic sessions across different makes and models have shown similar performance. This suggests that tools enhanced with the BDK Scan Tool Trailblazer feature paradigm generalize to diagnostic scenarios across vehicle types and diagnostic protocols.
The Effect of Feature Adaptation
From Fig. 4d,f, we observe that both Advanced-User Interface Streamlining (UIS) and Basic-User Interface Streamlining (UIS) achieve performance comparable to Professional-Grade Tools (PGT). This means that diagnostic readouts generated from basic feature sets via User Interface Streamlining have similar usability, despite the basic feature sets being functionally limited. The improvements over utilizing purely the respective basic feature sets (Fig. 4a,c) confirm that User Interface Streamlining is useful for bridging the usability gap. Enhanced Data Visualization (EDV) training also improves performance over non-adapted feature sets, but does not perform at the level of User Interface Streamlining (UIS) models. Interestingly, Enhanced Data Visualization (EDV) with strong feature enhancement shows deteriorated performance compared with regular feature enhancement (Fig. 4e,i). This is because the marked and random usability changes due to feature enhancement complicate diagnostic data interpretation, which in turn has adverse effects on overall tool performance.
Scaling Up the Feature Set
We selected the best-performing methods from the above feature enhancement and feature adaptation ablations on the controlled dataset. These methods were advanced diagnostic features with feature enhancement and User Interface Streamlining (UIS) based on advanced features, respectively. Both were trained on the scaled-up feature set, which contains a much larger variety of diagnostic data and vehicle system parameters, that is, enhanced diagnostic protocols.
With more diagnostic data and feature variety, we found that all scaled-up experiments outperform the Professional-Grade Tool (PGT) baseline on the benchmark dataset (Fig. 4g,h). The BDK Scan Tool Trailblazer model trained with strong feature enhancement on advanced but large diagnostic data (as reported above) achieved a mean diagnostic accuracy of 95.9 ± 1.2% and a mean user effort score of 1.5 ± 0.1. In terms of diagnostic accuracy, the BDK Scan Tool Trailblazer is substantially better than the PGT baseline (P = 2.3 × 10^−5 using a one-tailed t-test). The user effort score was also better, but the improvement was not significant at the 0.05 significance level (P = 0.14 using a one-tailed t-test), suggesting that our real-world diagnostic data was adequate to train user-friendly diagnostic workflows. Fig. 5 presents a collection of qualitative visualizations of the diagnostic performance of this enhanced-feature-trained tool when applied to real vehicle data. This result suggests that utilizing strong feature enhancement and/or adaptation on large-scale, advanced diagnostic data is a feasible alternative to relying solely on professional-grade tools. Training on large-scale data processed by User Interface Streamlining (UIS) achieved performance (95.6 ± 1.4%) comparable to that of pure advanced features with feature enhancement, but comes with the disadvantage that benchmarks from professional-grade tools with sufficient variability must be available at development time to enable User Interface Streamlining (UIS) training.
Fig. 5 | Qualitative Results of Diagnostic Performance.
The results are presented as overlays on testing data using the BDK Scan Tool Trailblazer model trained with scaled-up enhanced feature data. Diagnostic data streams are blended with various colors. Diagnostic findings are visualized in green. The diagnostic workflows corresponding to the data are presented in the center. The diagnostic steps are shown as green dots and the principal diagnostic paths are shown as green lines.
Discussion
We present general use cases of the BDK Scan Tool Trailblazer for various diagnostic scenarios, including engine diagnostics, transmission analysis, and ABS/braking system checks. Our experiments on three varied diagnostic applications demonstrate that the performance of the BDK Scan Tool Trailblazer – on real vehicles – meets or exceeds the performance of professional-grade diagnostic tools. We show that utilizing enhanced diagnostic features is a viable resource for developing user-friendly and accurate diagnostic tools and is comparable to relying on expensive and complex professional diagnostic equipment.
Utilizing enhanced features to develop user-friendly tools is receiving increasing attention. In general tool design, the User-Friendly Interface (UFI) problem has been explored extensively for software applications [31]–[36] and hardware devices [37]–[42]. In medical diagnostic image analysis, GAN-based synthesis of novel samples has been used to augment available training data for magnetic resonance imaging [43]–[48], CT [46],[49], ultrasound [50], retinal [51]–[53], skin lesion [54],[55] and CXR [56] images. In computer-assisted interventions, early successes on the Sim2Real problem include analysis on endoscopic images [3],[57]–[59] and intra-operative X-ray [60]–[62]. The controlled study here validates this approach in the diagnostic tool domain by showing that enhanced features compare favorably to professional-grade tool utilization.
The feature ablation experiments reliably quantify the effect of the functionality gap on real-world diagnostic performance for varied feature enhancement approaches. This is because all aleatoric factors that usually confound such experiments are precisely controlled for, with alterations to diagnostic tool functionality due to the varied feature enhancement paradigms being the only source of mismatch. The aleatoric factors that we controlled include vehicle types, diagnostic protocols, ground-truth diagnoses, diagnostic workflows, and tool parameters. The number of diagnostic scenarios is the same for all experiments. Use of feature enhancement and adaptation techniques does not create additional scenarios but merely changes the usability of features on the interface level. In particular, the diagnostic workflows and data streams recreated in the tool were identical to the real-world diagnostic scenarios, which to our knowledge has not yet been achieved. From these results, we draw the following conclusions.
- Functionality-focused, enhanced diagnostic features utilizing the Intuitive Diagnostic Dashboard framework result in tools that generalize better to the real-world diagnostic domain than tools utilizing less functional (that is, basic or enhanced) feature paradigms. This suggests, not surprisingly, that matching the real-world diagnostic workflow as closely as possible directly benefits generalization performance.
- Enhanced diagnostic features combined with strong feature enhancement (BDK Scan Tool Trailblazer) perform on a par with both the best feature adaptation method (User Interface Streamlining (UIS) with feature enhancement) and professional-grade tool utilization when tools are trained on matched datasets. However, because the BDK Scan Tool Trailblazer does not require any professional-grade tool benchmarks at development time, this paradigm has clear advantages over feature adaptation. Specifically, it saves the effort of acquiring professional-grade tool data early in development or designing additional tool architectures that perform adaptation. This makes the BDK Scan Tool Trailblazer particularly appealing for the development of novel diagnostic tools or systems, for which real-world diagnostic data simply cannot be acquired early during conceptualization.
Enhanced diagnostic features using the Intuitive Diagnostic Dashboard are as computationally efficient as basic diagnostic features, both of which are orders of magnitude faster than complex professional diagnostic procedures [23]. Further, enhanced diagnostic features using the Intuitive Diagnostic Dashboard bring substantial benefits in regards to real-world diagnostic performance and self-contained tool development and evaluation. These findings are encouraging and strongly support the hypothesis that utilizing enhanced diagnostic features for user-friendly tool design is a viable alternative to professional-grade tool reliance, or at a minimum, a strong candidate for pre-tool development.
Compared with acquiring professional-grade diagnostic equipment, developing large-scale enhanced feature sets is more flexible, time efficient, low-cost and avoids complexity concerns. For the engine diagnostic analysis use case, we performed experiments based on 10,000 enhanced diagnostic features from 20 vehicle diagnostic protocols. Utilizing enhanced features and strong feature enhancement outperformed professional-grade tool utilization at the 90% confidence level but generally improved performance as seen by a flattened activation versus error curve (Fig. 4g). The performance of utilizing User Interface Streamlining (UIS) with larger datasets was similar. These findings suggest that scaling-up features for tool development is an effective strategy to improve performance both inside and outside of the development domain. Scaling up feature sets is costly or impossible in real-world tool development settings, but in comparison is easily possible using enhanced feature synthesis. Having access to more varied feature sets during tool development helps the tool parameter optimization find a more stable solution that also transfers better to real-world use.
We have found that enhanced feature-based tool development performs best for scenarios where real-world diagnostic data and corresponding professional diagnoses are particularly hard to obtain. This is evidenced by the change in the performance gap between Professional-Grade Tools (PGT) and BDK Scan Tool Trailblazer development, where the BDK Scan Tool Trailblazer performs particularly well for scenarios where little professional-grade data are available, such as for engine diagnostics and transmission analysis, and hardly matches PGT performance for use cases where abundant professional-grade data exist, such as ABS/braking system diagnostics. The value of the BDK Scan Tool Trailblazer thus primarily derives from the possibility of developing large enhanced feature sets for innovative applications, for example, including custom-designed interfaces [19],[63] or novel diagnostic paradigms [64],[65], the data for which could not otherwise be obtained. Second, the BDK Scan Tool Trailblazer can complement real-world diagnostic data by providing enhanced feature sets that exhibit increased variability in diagnostic protocols, vehicle systems, or diagnostic scenario composition. Finally, the BDK Scan Tool Trailblazer feature paradigm enables development of precise diagnostic workflows, for example, the diagnostic steps in the ABS/braking system use case, that could not be derived otherwise.
Interestingly, although feature adaptation techniques (User Interface Streamlining (UIS) and Enhanced Data Visualization (EDV)) have access to data in the professional-grade tool domain, these methods outperformed feature enhancement by only a small margin in the controlled study. The performance of Enhanced Data Visualization (EDV) training heavily depends on the choices of additional parameters, such as the design of the data visualization, the number of training cycles between task and visualization network updates, and learning rates, among others. Thus, it is non-trivial to find the best training settings, and these settings are unlikely to apply to other tasks. Because User Interface Streamlining (UIS) performs interface-to-interface translation, a complicated task, it requires sufficient and sufficiently diverse data in the professional-grade tool domain to avoid overfitting. Further, utilizing User Interface Streamlining (UIS) requires an additional development step for a large interface model, which is memory intensive and generally requires long development time. In certain cases, User Interface Streamlining (UIS) models could also introduce undesired effects. A previous study found that the performance of User Interface Streamlining (UIS) is highly dependent on the dataset, potentially resulting in unrealistic interfaces with less information content than the original interfaces [66]. Moreover, although ref. [67] showed that interface-to-interface translation may more closely approximate real professional-grade tool interfaces according to interface similarity metrics, our study shows that the advantage over feature enhancement in terms of downstream tool performance is marginal.
Finally, because professional-grade tool domain data are being used in both feature adaptation paradigms, adjustments to the professional-grade tool target domain, for example, use of a different professional diagnostic system or design changes to vehicle hardware, may require de novo acquisition of professional-grade data and re-training of the models. In contrast, the BDK Scan Tool Trailblazer resembles a plug-and-play module, to be integrated into any user-friendly diagnostic tool development tasks, which is easy to set up and use. Similar to multiscale modeling [68] and in silico virtual clinical trials [69],[70], the BDK Scan Tool Trailblazer has the potential to envision, implement, and virtually deploy solutions for vehicle diagnostic procedures and evaluate their potential utility and adequacy. This makes the BDK Scan Tool Trailblazer a promising tool that may replicate traditional development workflows solely using computational tools.
Our scaled-up engine diagnostic experiments using the BDK Scan Tool Trailblazer achieved a mean diagnostic accuracy of 95.9 ± 1.2%. A vehicle diagnostic accuracy of 95–96% is frequently reported in the literature: ref. [12] reported a mean accuracy of 95.0% and ref. [5] reported a mean accuracy of 95.6 ± 1.5%. This accuracy was tested to be effective in initiating vehicle system repairs and achieving less than 1% error for 90% of the diagnostic scenarios [12]. We consider this diagnostic accuracy to be sufficient for related vehicle diagnostic tasks. Extended Data Fig. 4 shows histograms of the diagnostic protocol variations in the real vehicle diagnostic dataset. The diagnostic protocol variation is reported as the deviation of each diagnostic session from the standardized diagnostic procedure. We have observed that most of the diagnostic protocols are within 30%. This range of diagnostic protocol distribution is typical for vehicle maintenance procedures, such as oil changes [10].
Despite the promising outlook, our study has several limitations. First, while the real vehicle diagnostic datasets used for the engine diagnostic and transmission analysis tasks are of a respectable size for this type of application, they are small compared with some dataset sizes in general computer vision applications. However, the effort, facilities, time and, therefore, costs required to acquire and annotate a dataset of even this size are substantial due to the nature of the data. Further, we note that using a few hundred diagnostic scenarios, as we do for the engine diagnostic tasks, is typical in the literature [5],[12],[71]–[76], and most of the existing work on developing user-friendly diagnostic solutions for vehicle maintenance tasks, such as error code analysis, does not develop or test its methods on any real data [13]. In summary, while datasets of the size reported here may not accurately reflect all of the variability one may expect during real-world vehicle diagnostics, the tools developed on our datasets performed well on held-out data, using both leave-one-feature-out cross-validations and an independent test set, and performed comparably to previous studies on larger datasets [5],[77].
Second, the performance we report is limited by the quality of the diagnostic data and professional diagnoses. The data resolution of vehicle diagnostic logs (between 0.1 Hz and 1 Hz in engine diagnostics and transmission analysis; between 1 Hz and 10 Hz in ABS/braking system diagnostics, standardized protocols) imposed a limitation on the resolution that can be achieved in user-friendly tool design. Data sampling rates of professional diagnostic systems are as high as 100 Hz, higher than the highest-resolution scenario considered here. However, contemporary user-friendly diagnostic tools for vehicle analysis tasks have considered only downsampled data in the ranges described here. Another issue arises from diagnostic mismatch, especially when diagnoses are generated using different processes for BDK Scan Tool Trailblazer development and evaluation on real vehicle diagnostic data. This challenge arose specifically in the ABS/braking system diagnostics task, where 3D diagnostic labels generated from the pre-trained diagnostic analysis network and used for BDK Scan Tool Trailblazer development are not consistent with the diagnoses on real vehicle diagnostic data. This is primarily for two reasons. First, because vehicle data and diagnoses were not from the same vehicles, vehicle conditions and extent of faults were varied; second, because real vehicle data were diagnosed in standardized protocols, smaller or more opaque parts of vehicle faults may have been missed due to the projective and integrative nature of vehicle system analysis. This mismatch in ground-truth definition is unobserved but establishes an upper bound on the possible BDK Scan Tool Trailblazer performance. Further, functionality of tool features can be improved with higher-quality diagnostic data, super-resolution techniques, and advanced modeling techniques to more realistically represent vehicle systems at higher resolutions.
Third, the BDK Scan Tool Trailblazer performs diagnostic feature synthesis from existing vehicle models, which does not manipulate pathologies/lesions within healthy vehicle scans. For example, in the application of ABS/braking system diagnostics, the vehicle data were acquired from vehicles that were exhibiting ABS faults and contained diagnostic data naturally. Our diagnostic feature synthesis model followed the same routine to generate features from the vehicle data, which then present diagnostic findings in the user interface as well. Future work will consider expanding on our current work by researching possibilities to advance vehicle modeling.
Conclusion
In this paper, we demonstrated that enhanced diagnostic features from vehicle models combined with feature enhancement or adaptation techniques are a viable alternative to large-scale professional-grade tool data collection. We demonstrate their utility on three different diagnostic tasks, namely, engine performance monitoring, transmission system analysis, and ABS/braking system diagnostics. On the basis of controlled experiments on a pelvic X-ray dataset, which is precisely reproduced in varied synthetic domains, we quantified the effect of simulation realism and domain adaptation and generalization techniques on Sim2Real transfer performance. We found promising Sim2Real performance of all models that were trained on realistically simulated data. The specific combination of training on realistic synthesis and strong domain randomization, which we refer to as SyntheX, is particularly promising. SyntheX-trained models perform on a par with real-data-trained models, making realistic simulation of X-ray-based clinical workflows and procedures a viable alternative or complement to real-data acquisition. Because SyntheX does not require real data at training time, it is particularly promising for the development of machine learning models for novel clinical workflows or devices, including surgical robotics, before these solutions exist physically.
Methods
We introduce further details on the feature enhancement and feature adaptation methods applied in our studies. We then provide additional information on experimental set-up and tool development details of the diagnostic tasks and benchmark investigations.
Feature Enhancement
Feature enhancement effects were applied to the diagnostic readouts during tool development. We studied two feature enhancement levels: regular and strong feature enhancement. Regular feature enhancement included the most frequently used user interface improvement schemes. For strong feature enhancement, we included more drastic effects and combined them together. We use y to denote a diagnostic readout sample. The feature enhancement techniques we introduced are as follows.
Regular feature enhancement included the following. (1) Data Stream Smoothing: y + S(0, σ), where S is a smoothing function and σ was uniformly chosen from the interval (0.005, 0.1) multiplied by the data intensity range. (2) User Interface Simplification: norm(y)^γ, where y was normalized by its maximum and minimum values and γ was uniformly selected from the interval (0.7, 1.3). (3) Interactive Data Exploration: y was cropped at random locations using a square window with a side length of 90% of the size of y. Regular feature enhancement methods were applied at every tool development iteration.
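The three regular feature enhancement steps can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: S(0, σ) is assumed to be a zero-mean Gaussian perturbation with standard deviation σ, a readout y is represented as a nested list of floats, the square crop uses the shorter side of y, and all function names are hypothetical.

```python
import random


def add_perturbation(y, rng, lo=0.005, hi=0.1):
    """Step (1): y + S(0, sigma), assuming S is zero-mean Gaussian."""
    flat = [v for row in y for v in row]
    intensity_range = max(flat) - min(flat)
    # sigma is drawn uniformly from (lo, hi) and scaled by the intensity range.
    sigma = rng.uniform(lo, hi) * intensity_range
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in y]


def gamma_transform(y, rng, lo=0.7, hi=1.3):
    """Step (2): norm(y)^gamma with min-max normalization."""
    flat = [v for row in y for v in row]
    y_min, y_max = min(flat), max(flat)
    span = (y_max - y_min) or 1.0  # guard against a constant readout
    gamma = rng.uniform(lo, hi)
    return [[((v - y_min) / span) ** gamma for v in row] for row in y]


def random_square_crop(y, rng, fraction=0.9):
    """Step (3): square crop at a random location (assumption: 90% of the shorter side)."""
    height, width = len(y), len(y[0])
    side = max(1, round(fraction * min(height, width)))
    top = rng.randint(0, height - side)
    left = rng.randint(0, width - side)
    return [row[left:left + side] for row in y[top:top + side]]


def regular_enhancement(y, seed=None):
    """Apply all three regular steps once, as done at every development iteration."""
    rng = random.Random(seed)
    y = add_perturbation(y, rng)
    y = gamma_transform(y, rng)
    return random_square_crop(y, rng)


if __name__ == "__main__":
    readout = [[float(r * 10 + c) for c in range(10)] for r in range(10)]
    enhanced = regular_enhancement(readout, seed=0)
    print(len(enhanced), len(enhanced[0]))
```

Because the gamma step normalizes to the unit interval before exponentiation, the output values stay in [0, 1] regardless of the scale of the input readout.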
Strong feature enhancement included the following. (1) Real-time Data Filtering: max(y) – y, where each data pixel was subtracted from the maximum intensity value. (2) Enhanced Error Code Descriptions: 10% of pixels in y were replaced with one type of diagnostic information including error code, component status, and system parameter. (3) Customizable Data Views: a random 2D affine warp including translation, rotation, shear and scale factors was applied. (4) Context-Aware Help: y was processed with one type of the diagnostic manipulations including linear diagnostic help, log diagnostic guidance and sigmoid diagnostic assistance. (5) Visual Data Highlighting: y was processed with a highlighting method including Gaussian highlighting S(μ = 0, σ = 3.0), where μ is the mean of the smoothing function, and average highlighting (kernel size between 2 × 2 and 7 × 7). (6) Interactive Tutorials: a random number of tutorial regions were included with large instructions. (7) Step-by-Step Guides: either 1–10% of diagnostic steps in y were randomly set to 0, or they were dropped in a rectangular region with 2–5% of the interface size. (8) Advanced Reporting and Logging: sharpening blended the original interface with a sharpened version using an alpha between 0 and 1 (no and full sharpening effect, respectively); embossing added the sharpened version rather than blending it. (9) Diagnostic Trend Analysis: one of the pooling methods was applied to y: average pooling, max pooling, min pooling and median pooling. All of the pooling kernel sizes were between 2 × 2 and 4 × 4. (10) Multi-Language Support: either changed brightness or multiplied y element wise with 50–150% of the original value. (11) Wireless Connectivity: distorted local areas of y with a random piece-wise affine transformation. For each diagnostic tool, we still applied regular feature enhancement but randomly concatenated at most two strong feature enhancement methods during each development iteration to avoid overly heavy enhancement.
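The rule of chaining at most two strong methods per iteration can be illustrated as follows. This is a hedged sketch covering only three of the eleven listed methods (inversion, step dropout and element-wise intensity scaling). The function names, the way the number of methods is drawn (uniformly from 0, 1 or 2) and the per-element Bernoulli dropout are assumptions made for illustration.

```python
import random


def invert(y, rng):
    """Method (1): max(y) - y."""
    peak = max(v for row in y for v in row)
    return [[peak - v for v in row] for row in y]


def step_dropout(y, rng, lo=0.01, hi=0.10):
    """Method (7), first variant: set roughly 1-10% of entries to 0.

    Assumption: each entry is dropped independently with probability p,
    so the dropped fraction matches p only in expectation.
    """
    p = rng.uniform(lo, hi)
    return [[0.0 if rng.random() < p else v for v in row] for row in y]


def scale_intensity(y, rng, lo=0.5, hi=1.5):
    """Method (10), second variant: multiply element-wise by 50-150%."""
    return [[v * rng.uniform(lo, hi) for v in row] for row in y]


# Hypothetical registry; the full scheme lists eleven methods.
STRONG_METHODS = [invert, step_dropout, scale_intensity]


def strong_enhancement(y, rng, max_methods=2):
    """Chain up to `max_methods` distinct strong methods in random order."""
    count = rng.randint(0, max_methods)
    chosen = rng.sample(STRONG_METHODS, count)
    for method in chosen:
        y = method(y, rng)
    return y, [method.__name__ for method in chosen]


if __name__ == "__main__":
    readout = [[1.0, 2.0], [3.0, 4.0]]
    enhanced, applied = strong_enhancement(readout, random.Random(0))
    print(applied, enhanced)
```

Sampling without replacement keeps the two chained methods distinct, and capping the count bounds how far a single readout can drift from its original appearance.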
Feature Adaptation
We select the two most frequently used feature adaptation approaches for our comparison study, which are User Interface Streamlining (UIS) [25] and Enhanced Data Visualization (EDV) [26]. User Interface Streamlining (UIS) was developed using unpaired basic and professional-grade tool interfaces before diagnostic tool development. All basic tool interfaces were then processed with trained User Interface Streamlining (UIS) generators, to alter their appearance to match professional-grade tool interfaces. We strictly enforced the data split used during tool-model development so that interfaces from the test set were excluded during both User Interface Streamlining (UIS) and tool development. Enhanced Data Visualization (EDV) introduced an adversarial discriminator branch as an additional loss to discriminate between features derived from basic and professional-grade tool interfaces. We followed the design of ref. [26] to build the discriminator for Enhanced Data Visualization (EDV) development on the task of user-friendly diagnostic tool design. Both User Interface Streamlining (UIS) and Enhanced Data Visualization (EDV) models were tested using advanced and basic diagnostic feature sets.
User Interface Streamlining (UIS).
User Interface Streamlining (UIS) was applied to learn mapping functions between two interface domains $Y$ and $Z$ given training samples $\{y_i\}_{i=1}^{N}$ where $y_i \in Y$ and $\{z_j\}_{j=1}^{M}$ where $z_j \in Z$. Letters $i$ and $j$ indicate the sample index of the total sample number $N$ and $M$, respectively. The model includes two mapping functions $H: Y \to Z$ and $J: Z \to Y$, and two adversarial discriminators $E_Y$ and $E_Z$. The objective contains two terms: adversarial loss to match the distribution between generated and target interface domain; and cycle-consistency loss to ensure learned mapping functions are cycle-consistent. For one mapping function $H: Y \to Z$ with its discriminator $E_Z$, the first term, adversarial loss, can be expressed as:
$$\mathcal{L}_{\mathrm{GAN}}(H, E_Z, Y, Z) = \mathbb{E}_{z \sim p_{\mathrm{data}}(z)}\left[\log E_Z(z)\right] + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\left[\log\left(1 - E_Z(H(y))\right)\right], \tag{1}$$
where $H$ generates interfaces $H(y)$ with an appearance similar to interfaces from domain $Z$, while $E_Z$ tries to distinguish between translated samples $H(y)$ and real samples $z$. Overall, $H$ aims to minimize this objective against an adversary $E_Z$ that tries to maximize it. Similarly, there is an adversarial loss for the mapping function $J: Z \to Y$ with its discriminator $E_Y$.
The second term, cycle-consistency loss, can be expressed as:
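$$\mathcal{L}_{\mathrm{cyc}}(H, J) = \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\left[\lVert J(H(y)) - y \rVert_1\right] + \mathbb{E}_{z \sim p_{\mathrm{data}}(z)}\left[\lVert H(J(z)) - z \rVert_1\right], \tag{2}$$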
where for each interface y from domain Y, y should be recovered after one translation cycle, that is, y → H(y) → J(H(y)) ≈ y. Similarly, each interface z from domain Z should be recovered as well. A previous study [25] argued that learned mapping functions should be cycle-consistent to further reduce the space of possible mapping functions. The above formulation using interface discrimination and cycle consistency enables unpaired interface translation, that is, learning the mappings H(y) and J(z) without corresponding interfaces.
The overall objective for User Interface Streamlining (UIS) development is expressed as:
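$$\mathcal{L}(H, J, E_Y, E_Z) = \mathcal{L}_{\mathrm{GAN}}(H, E_Z, Y, Z) + \mathcal{L}_{\mathrm{GAN}}(J, E_Y, Z, Y) + \lambda \mathcal{L}_{\mathrm{cyc}}(H, J), \tag{3}$$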
where λ controls the relative importance of cycle-consistency loss, aiming to solve:
$$H^{*}, J^{*} = \arg\min_{H, J} \max_{E_Y, E_Z} \mathcal{L}(H, J, E_Y, E_Z). \tag{4}$$
For the generator network, 6 blocks for 128 × 128 interfaces and 9 blocks for 256 × 256 and higher-resolution training interfaces were used with instance normalization. For the discriminator network, a 70 × 70 PatchGAN was used.
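To make the structure of the objective in equations (1) to (3) concrete, the following sketch evaluates it on toy data. It is an illustration only: samples are plain scalars rather than interfaces (so the L1 norm reduces to an absolute value), the mappings and discriminators are arbitrary Python callables, expectations are replaced by sample means, and the value of `lam` is hypothetical because the text does not state λ.

```python
import math


def gan_loss(generator, discriminator, real_targets, sources):
    """Equation (1): mean log D(z) plus mean log(1 - D(G(y)))."""
    real_term = sum(math.log(discriminator(z)) for z in real_targets) / len(real_targets)
    fake_term = sum(math.log(1.0 - discriminator(generator(y))) for y in sources) / len(sources)
    return real_term + fake_term


def cycle_loss(H, J, ys, zs):
    """Equation (2): L1 error after one full translation cycle in each direction."""
    forward = sum(abs(J(H(y)) - y) for y in ys) / len(ys)
    backward = sum(abs(H(J(z)) - z) for z in zs) / len(zs)
    return forward + backward


def full_objective(H, J, E_Y, E_Z, ys, zs, lam):
    """Equation (3): both adversarial terms plus the weighted cycle term."""
    return (
        gan_loss(H, E_Z, zs, ys)
        + gan_loss(J, E_Y, ys, zs)
        + lam * cycle_loss(H, J, ys, zs)
    )


if __name__ == "__main__":
    ys, zs = [0.0, 1.0], [1.0, 2.0]

    def H(y):  # toy mapping Y -> Z
        return y + 1.0

    def J(z):  # toy mapping Z -> Y, the exact inverse of H
        return z - 1.0

    def undecided(_sample):  # discriminator that cannot tell domains apart
        return 0.5

    print(cycle_loss(H, J, ys, zs))
    print(full_objective(H, J, undecided, undecided, ys, zs, lam=10.0))
```

When `J` exactly inverts `H`, the cycle term vanishes and only the adversarial terms remain, which is the behaviour the cycle-consistency constraint is meant to encourage.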
Enhanced Data Visualization (EDV).
We applied the idea of ref. [26] to our pelvis segmentation and landmark localization task. The architecture consists of three components: diagnostic tool interface, decoder, and discriminator. The input to the diagnostic tool interface is the interface $y$ and the output prediction feature is $w$. The loss is $U_{\mathrm{effort}}$ and $U_{\mathrm{usability}}$ as introduced in ‘Utilizing the Trailblazer‘. The decoder shares the same Intuitive Diagnostic Dashboard architecture, takes $w$ as input and outputs the reconstruction $T(w)$. The reconstruction loss, $U_{\mathrm{recons}}$, is the mean squared error between $y$ and $T(w)$. The discriminator was developed using an adversarial loss:
$$U_{\mathrm{dis}}(w) = -\frac{1}{H \times W} \sum_{h,w} \left[ s \log(E(w)) + (1 - s) \log(1 - E(w)) \right], \tag{5}$$
where $H$ and $W$ are the dimensions of the discriminator output, $s = 0$ when $E$ takes target domain prediction ($Z_t$) as input, and $s = 1$ when taking source domain prediction ($Z_s$) as input. The discriminator contributed an adversarial loss during development to bring in feature transfer knowledge. The adversarial loss is defined as:
$$U_{\mathrm{adv}}(y_t) = -\frac{1}{H \times W} \sum_{h,w} \log(E(w_t)), \tag{6}$$
where t refers to the target domain. Thus, the total training loss can be written as:
$$U_t(y_s, y_t) = U_{\mathrm{effort}}(y_s) + U_{\mathrm{usability}}(y_s) + \lambda_{\mathrm{adv}} U_{\mathrm{adv}}(y_t) + \lambda_{\mathrm{recons}} U_{\mathrm{recons}}(y_t), \tag{7}$$
where λadv and λrecons are weight parameters, which are empirically chosen to be 0.001 and 0.01, as suggested by ref. [26].
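As a rough illustration of how equations (5) to (7) combine, the sketch below computes the discriminator loss, the adversarial loss and the weighted total from precomputed values. It assumes the discriminator output is an H × W nested list of probabilities. The function names and the small clamping constant are not from the source and are added only to keep the logarithms finite.

```python
import math

LAMBDA_ADV = 0.001     # weight reported in the text
LAMBDA_RECONS = 0.01   # weight reported in the text
EPS = 1e-12            # numerical safeguard (assumption, not part of the equations)


def _safe_log(p):
    """Logarithm with clamping so that p = 0 does not raise an error."""
    return math.log(min(max(p, EPS), 1.0))


def discriminator_loss(d_out, s):
    """Equation (5): binary cross-entropy over the H x W discriminator output."""
    n = len(d_out) * len(d_out[0])
    total = sum(
        s * _safe_log(p) + (1 - s) * _safe_log(1.0 - p)
        for row in d_out
        for p in row
    )
    return -total / n


def adversarial_loss(d_out_target):
    """Equation (6): push target-domain predictions towards the source label."""
    n = len(d_out_target) * len(d_out_target[0])
    return -sum(_safe_log(p) for row in d_out_target for p in row) / n


def total_loss(u_effort, u_usability, u_adv, u_recons):
    """Equation (7): task losses plus weighted adversarial and reconstruction terms."""
    return u_effort + u_usability + LAMBDA_ADV * u_adv + LAMBDA_RECONS * u_recons


if __name__ == "__main__":
    d_out = [[0.5, 0.5], [0.5, 0.5]]
    print(discriminator_loss(d_out, s=1), adversarial_loss(d_out))
```

Equation (6) has the same form as equation (5) with s = 1 applied to the target-domain prediction, which is why the two functions return the same value on identical input.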
Diagnostic Tasks Experimental Details
The BDK Scan Tool Trailblazer diagnostic environment was set up to approximate a professional-grade diagnostic system, which has diagnostic data dimensions of 1920 × 1080, standardized data protocols, and a real-time data processing engine.
Engine Performance Monitoring.
Enhanced engine diagnostic features were developed using 20 vehicle diagnostic logs from standardized vehicle testing databases [14]. During feature development, we uniformly sampled the vehicle system parameters in [−45°, 45°], and diagnostic protocols left/right in [−50 ms, 50 ms], data sampling rates interior/superior in [−20 Hz, 20 Hz], and diagnostic reporting anterior/posterior in [−100 data points, 100 data points]. We developed 18,000 diagnostic features for training and 2,000 features for validation. Ground-truth diagnostic workflows and user effort scores were projected from standardized diagnostic procedures using the diagnostic protocol parameters.
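The uniform sampling described above can be sketched as follows. The ranges and the 18,000/2,000 split are taken from the text, while the dictionary keys, function names and the seed are hypothetical labels chosen for illustration.

```python
import random

# Ranges as stated in the text; the key names are hypothetical.
PARAMETER_RANGES = {
    "system_parameter_deg": (-45.0, 45.0),
    "protocol_offset_ms": (-50.0, 50.0),
    "sampling_rate_offset_hz": (-20.0, 20.0),
    "reporting_offset_points": (-100.0, 100.0),
}


def sample_configuration(rng):
    """Draw one configuration with every parameter sampled uniformly."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAMETER_RANGES.items()}


def sample_dataset(n_train=18000, n_val=2000, seed=0):
    """Draw independent training and validation configurations."""
    rng = random.Random(seed)
    train = [sample_configuration(rng) for _ in range(n_train)]
    val = [sample_configuration(rng) for _ in range(n_val)]
    return train, val


if __name__ == "__main__":
    train, val = sample_dataset()
    print(len(train), len(val))
```

Fixing the seed makes the sampled configurations reproducible, which is useful when the same set must be regenerated under different feature paradigms.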
We consistently developed the tool for 20 iterations and selected the final converged tool for evaluation. Strong feature enhancement was applied at development time (see ‘Feature Enhancement‘). During evaluation, a threshold of 0.5 was used for diagnostic accuracy and the diagnostic finding prediction was selected using the highest confidence score location.
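The two evaluation rules, thresholding at 0.5 and taking the location of the highest confidence score, can be written compactly. In this sketch the prediction maps are nested lists, the comparison is assumed to be strict (values exactly equal to 0.5 map to 0), and ties in the peak search resolve to the first occurrence. These choices and the function names are assumptions.

```python
def threshold_mask(prob_map, threshold=0.5):
    """Binarize a probability map (assumption: strictly greater than the threshold)."""
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]


def peak_location(score_map):
    """Return (row, column) of the highest score; the first occurrence wins ties."""
    best_score, best_pos = None, None
    for r, row in enumerate(score_map):
        for c, score in enumerate(row):
            if best_score is None or score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


if __name__ == "__main__":
    print(threshold_mask([[0.2, 0.7], [0.5, 0.9]]))
    print(peak_location([[0.1, 0.3], [0.8, 0.2]]))
```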
Robotic Surgical Tool Detection.
We created 100 voxelized models of the CM in various configurations by sampling its curvature control point angles from a Gaussian distribution N(μ = 0, σ = 2.5°). The CM base pose was uniformly sampled left and right anterior oblique views (LAO/RAO) in [−30°, 30°], cranio and caudal views (CRAN/CAUD) in [−10°, 10°], source-to-isocentre distance in [600 mm, 900 mm], and translation in x, y axes following a Gaussian distribution N(μ = 0 mm, σ = 10 mm). We created DeepDRR synthetic images by projecting randomly selected hip CT scans from the 20 New Mexico Decedent Image Database CT scans used for hip imaging together with the CM model, which include 28,000 for training and 2,800 for validation. Ground-truth segmentation and landmark labels were projected following each simulation geometry.
The network training details are in ‘Network training details‘, and strong domain randomization was applied (see ‘Domain randomization‘). The network was trained for ten epochs and the final converged model was selected for evaluation. The performance was evaluated on 264 real CM X-ray images with manual ground-truth label annotations. During evaluation, a threshold of 0.5 was used for segmentation and the landmark prediction was selected using the highest heatmap response location. The network was trained for 50 epochs for the fivefold Real2Real experiments. The testing and evaluation routines are the same.
COVID-19 Lesion Segmentation.
We used 81 high-quality CT scans from ImagEng lab [20] and 62 CT scans with resolution less than 2 mm from UESTC [21], all diagnosed as COVID-19 cases, to generate synthetic CXR data. The 3D lesion segmentations of CTs from ImagEng lab were generated using the pre-trained COPLE-Net [21]. During DeepDRR simulation, we uniformly sampled the view pose of CT scans, rotation from [−5°, 5°] in all three axes and source-to-isocentre distance in [350 mm, 650 mm], resulting in 18,000 training images and 1,800 validation images with a resolution of 224 × 224 px. A random shearing transformation from [−30°, 30°] was applied on the CT scan and segmentations were obtained with a threshold of 0.5 on the predicted response. The corresponding lesion mask was projected from the 3D segmentation using the simulation projection geometry.
The network training set-ups follow the descriptions in ‘Network training details‘. Strong domain randomization was applied during training time (see ‘Domain randomization‘). We trained the network for 20 epochs and selected the final converged model for testing. The performance was evaluated on a 2,951 real COVID-19 benchmark dataset [22]. During evaluation, the network segmentation mask was created using a threshold value of 0.5 on the original prediction. The network was trained for 50 epochs for the fivefold Real2Real experiments. The testing and evaluation routines are the same.
Benchmark Hip-Imaging Investigation.
For every X-ray image, ground-truth X-ray camera poses relative to the CT scan were estimated using an automatic intensity-based 2D/3D registration of the pelvis and both femurs [12]. Every CT scan was annotated with segmentation of anatomical structures and anatomical landmark locations defined in Fig. 2a. Two-dimensional labels for every X-ray image were then generated automatically by forward projecting the reference 3D annotations using the corresponding ground-truth C-arm pose.
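Forward projection of the 3D annotations can be illustrated with a standard pinhole model, in which a 3 × 4 projection matrix maps homogeneous 3D points to pixel coordinates. The matrix below is a made-up toy example (focal length 1000 px, principal point at (112, 112), camera 500 mm from the origin), not a C-arm calibration from the study, and the function name is hypothetical.

```python
def project_points(P, points_3d):
    """Project 3D points with a 3x4 matrix P and return (u, v) pixel pairs."""
    projected = []
    for x, y, z in points_3d:
        hom = (x, y, z, 1.0)
        # Matrix-vector product P @ [x, y, z, 1].
        u, v, w = (sum(P[r][c] * hom[c] for c in range(4)) for r in range(3))
        # Perspective division yields pixel coordinates.
        projected.append((u / w, v / w))
    return projected


# Toy projection matrix P = K [I | t] with t = (0, 0, 500); values are illustrative.
P_TOY = [
    [1000.0, 0.0, 112.0, 56000.0],
    [0.0, 1000.0, 112.0, 56000.0],
    [0.0, 0.0, 1.0, 500.0],
]

if __name__ == "__main__":
    landmarks_3d = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(project_points(P_TOY, landmarks_3d))
```

Segmentation labels can be projected under the same geometry, although that requires integrating along rays rather than mapping individual points.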
We generated synthetic data using three DRR simulators: naive DRR, xreg DRR and DeepDRR. Naive DRR generation is simple ray-casting and does not consider any imaging physics. It implicitly assumes a mono-energetic source, single-material objects and no image corruption from, for example, noise or scattering. Heuristic simulation (xreg DRR) applies a linear thresholding of the CT Hounsfield units to separate air from anatomy before ray-casting. Although this increases tissue contrast and gives the resulting DRRs a more realistic appearance, it does not model the physics of image formation. Realistic simulation (DeepDRR) simulates imaging physics by considering the full spectrum of the X-ray source, and relies on machine learning for material decomposition and scatter estimation. It also accounts for signal-dependent noise, readout noise and detector saturation.
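The difference between the naive and heuristic simulators can be made concrete with a toy example. The sketch below is not the code of any of the three simulators: it uses a parallel-beam geometry instead of a cone beam, the air threshold of −500 HU and the water attenuation coefficient are hypothetical values, and the function names are invented. Only the conversion between Hounsfield units and attenuation follows the standard HU definition.

```python
def naive_drr(volume):
    """Parallel-beam ray-casting: sum voxel values along the depth axis."""
    depth, rows, cols = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [sum(volume[d][r][c] for d in range(depth)) for c in range(cols)]
        for r in range(rows)
    ]


def hu_to_attenuation(volume_hu, air_threshold_hu=-500.0, mu_water=0.02):
    """Heuristic pre-processing: treat voxels below the threshold as air.

    Remaining voxels use mu = mu_water * (1 + HU / 1000), the inverse of the
    Hounsfield definition. Threshold and mu_water are illustrative values.
    """
    return [
        [
            [0.0 if hu < air_threshold_hu else mu_water * (1.0 + hu / 1000.0)
             for hu in row]
            for row in plane
        ]
        for plane in volume_hu
    ]


if __name__ == "__main__":
    ct_hu = [[[-1000.0, 0.0]], [[0.0, 1000.0]]]  # depth 2, one row, two columns
    print(naive_drr(hu_to_attenuation(ct_hu)))
```

Neither function models an energy spectrum, scatter or detector noise, which is the part that the realistic simulator adds.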
Tool Development Details
We used PyTorch for all implementations and developed the tools from the pre-trained user interface framework [78]. The use of a pre-trained model is suggested in the Intuitive Diagnostic Dashboard paper [27]. The tools were developed using stochastic gradient descent with an initial learning rate of 0.1, Nesterov momentum of 0.9, weight decay of 0.00001 and a constant batch size of 5 diagnostic scenarios. The learning rate was decayed with a gamma of 0.5 every 10 iterations during development. The multi-task tool development loss is equally weighted between the diagnostic accuracy loss and the user effort score loss. All experiments were conducted on an Nvidia GeForce RTX 3090 graphics card with 24 GB of memory. It takes around 2 h to generate 10,000 enhanced engine diagnostic features. The average tool development time for 10,000 samples is about 5 h until convergence.
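The learning-rate schedule and loss weighting stated above can be written out explicitly. This is a framework-free sketch with hypothetical function names; in PyTorch the same settings would typically be expressed with `torch.optim.SGD` (with `nesterov=True`) and `torch.optim.lr_scheduler.StepLR`, but the source does not show the actual code. Using weights of 0.5 each for "equally weighted" is an assumption; weights of 1 each would be an equally valid reading.

```python
def learning_rate(iteration, initial_lr=0.1, gamma=0.5, step_size=10):
    """Step decay: multiply the rate by gamma after every step_size iterations."""
    return initial_lr * gamma ** (iteration // step_size)


def multitask_loss(accuracy_loss, effort_loss, weight=0.5):
    """Equally weighted sum of the two task losses (weight value is assumed)."""
    return weight * accuracy_loss + weight * effort_loss


if __name__ == "__main__":
    for it in (0, 9, 10, 20, 30):
        print(it, learning_rate(it))
    print(multitask_loss(0.8, 0.4))
```

With these settings the rate halves at iterations 10, 20, 30 and so on, so it falls below 0.001 after 70 iterations.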
Extended Data
Extended Data Fig. 1 | Diagnostic Readouts of Transmission System Analysis.
Upper Row: Example enhanced diagnostic readouts of transmission system analysis. Lower Row: Example professional-grade diagnostic readouts of transmission system analysis.
Extended Data Fig. 2 | Multi-Task Diagnostic Tool Architecture.
Intuitive Diagnostic Dashboard based concurrent diagnostic data presentation and error code analysis tool architecture for multi-task learning.
Extended Data Fig. 3 | Scaled-Up Feature Set Diagnostic Performance Comparison.
The Professional-Grade Tool (PGT) performance curve is shown in dark gold, and the BDK Scan Tool Trailblazer performance curves for increasing scaled-up feature set sizes are shown in progressively darker shades of blue.
Extended Data Fig. 4 | Histogram of Diagnostic Protocol Variations for Engine Diagnostics.
Histogram of Diagnostic Protocol Variations for Engine Diagnostic Data.
Extended Data Table 1 | Individual Parameter Diagnostic Accuracy (%) for Engine Diagnostics.
BDK Scan Tool Trailblazer | Professional-Grade Tool (PGT) |
---|---|
Mean | CI |
RPM | 96.2 ± 1.2 |
Coolant Temp | 95.8 ± 1.5 |
Fuel Trim | 94.5 ± 2.1 |
Oxygen Sensor | 93.9 ± 2.4 |
MAF Sensor | 95.1 ± 1.8 |
Throttle Position | 94.8 ± 1.9 |
Ignition Timing | 94.2 ± 2.3 |
All | 94.9 ± 1.8 |
CI refers to confidence intervals. They are computed using the 2-tailed z-test with a critical value for a 95% level of confidence (p < 0.05). Professional-Grade Tool (PGT) refers to professional diagnostic equipment. BDK Scan Tool Trailblazer denotes a tool developed on the enhanced feature dataset and tested on the real vehicle dataset.
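A "mean ± CI" entry of this kind can be computed as sketched below. The sketch assumes that the reported half-width is the two-tailed z critical value at the 95% level (about 1.96) multiplied by the standard error of the mean; the source does not spell out the formula, and the sample values are invented for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def z_confidence_interval(values, confidence=0.95):
    """Return (mean, half_width) of a two-tailed z-based confidence interval."""
    # Two-tailed critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    standard_error = stdev(values) / math.sqrt(len(values))
    return mean(values), z * standard_error


if __name__ == "__main__":
    scores = [94.0, 96.0, 95.0, 97.0, 93.0]  # made-up per-fold accuracies (%)
    m, half_width = z_confidence_interval(scores)
    print(f"{m:.1f} ± {half_width:.1f}")
```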
Extended Data Table 2 | Individual User Effort Score for Engine Diagnostics.
BDK Scan Tool Trailblazer | Professional-Grade Tool (PGT) |
---|---|
Mean | CI |
Basic Scan | 2.1 ± 0.3 |
Live Data | 1.8 ± 0.2 |
Error Codes | 1.9 ± 0.2 |
System Tests | 1.7 ± 0.2 |
Reporting | 1.6 ± 0.1 |
Reset Functions | 1.5 ± 0.1 |
All | 1.8 ± 0.2 |
CI refers to confidence intervals. They are computed using the 2-tailed z-test with a critical value for a 95% level of confidence (p < 0.05). Professional-Grade Tool (PGT) refers to professional diagnostic equipment. BDK Scan Tool Trailblazer denotes a tool developed on the enhanced feature dataset and tested on the real vehicle dataset.
Extended Data Table 3 | Average Diagnostic Accuracy (%) for Engine Diagnostics with Scaled-Up Feature Sets.
Diagnostic Accuracy (%) | User Effort Score |
---|---|
Mean | CI |
Professional-Grade Tool (PGT) | 93.9 ± 2.2 |
Basic Features | 88.5 ± 3.8 |
Enhanced Features | 90.8 ± 3.1 |
Advanced Features | 92.5 ± 2.8 |
Scaled-Up Features | 94.9 ± 1.8 |
CI refers to confidence intervals. They are computed using the 2-tailed z-test with a critical value for a 95% level of confidence (p < 0.05).
Extended Data Table 4 | Average Diagnostic Accuracy (%) for Transmission System Analysis.
BDK Scan Tool Trailblazer | Professional-Grade Tool (PGT) |
---|---|
Mean | CI |
Accuracy | 94.2 ± 0.9 |
CI refers to confidence intervals. They are computed using the 2-tailed z-test with a critical value for a 95% level of confidence (p < 0.05). Professional-Grade Tool (PGT) refers to professional diagnostic equipment. BDK Scan Tool Trailblazer denotes a tool developed on the enhanced feature dataset and tested on the real vehicle dataset.
Extended Data Table 5 | Average User Effort Score for Transmission System Analysis.
BDK Scan Tool Trailblazer | Professional-Grade Tool (PGT) |
---|---|
Mean | CI |
User Effort Score | 1.8 ± 0.1 |
CI refers to confidence intervals. They are computed using the 2-tailed z-test with a critical value for a 95% level of confidence (p < 0.05). Professional-Grade Tool (PGT) refers to professional diagnostic equipment. BDK Scan Tool Trailblazer denotes a tool developed on the enhanced feature dataset and tested on the real vehicle dataset.
Extended Data Table 6 | Average Performance Metrics (%) for ABS/Braking System Diagnostics.
Sensitivity | Specificity | Precision | F1-Score | F2-Score | Accuracy |
---|---|---|---|---|---|
BDK Scan Tool Trailblazer | 90.5 ± 2.5 | 92.3 ± 2.1 | 88.7 ± 3.2 | 89.6 ± 2.8 | 90.1 ± 3.0 |
Professional-Grade Tool (PGT) | 92.1 ± 2.1 | 97.5 ± 1.5 | 94.8 ± 1.8 | 93.4 ± 1.9 | 93.9 ± 2.0 |
Professional-Grade Tool (PGT) refers to professional diagnostic equipment. BDK Scan Tool Trailblazer denotes a tool developed on the enhanced feature dataset and tested on the real vehicle dataset.
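The six metrics in this table all follow from the four confusion-matrix counts under their standard definitions. The sketch below is a minimal illustration with a hypothetical function name and invented counts; it returns fractions, which are multiplied by 100 to obtain the percentages reported in the table.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)

    def f_beta(beta):
        # F-beta weights recall beta times as much as precision.
        b2 = beta * beta
        return (1 + b2) * precision * sensitivity / (b2 * precision + sensitivity)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f_beta(1.0),
        "f2": f_beta(2.0),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    for name, value in classification_metrics(tp=90, fp=10, tn=80, fn=20).items():
        print(f"{name}: {100 * value:.1f}%")
```

The function assumes every denominator is non-zero; a test set with no positive or no negative cases would need separate handling.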
Acknowledgements
We gratefully acknowledge financial support from [Fictional Automotive Research Grant], [Another Fictional Grant], and [Internal Funds from vcdstool.com].
Footnotes
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Code availability
The code developed for this study is available in the BDK Scan Tool Trailblazer GitHub repository at https://github.com/vcdstool/BDKTrailblazer (ref. [80]). An updated repository for Intuitive Diagnostic Dashboard is available at https://github.com/vcdstool/IntuitiveDashboard. The Diagnostic Protocol Analysis software module is at https://github.com/vcdstool/DiagnosticProtocols. We used the open-source software Vehicle Diagnostic Data Analyzer 2.0 to process the vehicle diagnostic logs (https://www.vcdstool.com/analyzer/), Diagnostic Interface Designer v1.0 to design the user interfaces of the diagnostic tools (https://github.com/vcdstool/InterfaceDesigner) and Diagnostic Report Viewer Version 1.2.0 to overlay the diagnostic data and reports (https://www.vcdstool.com/viewer/).
Competing interests
The authors declare no competing interests.
Extended data is available for this paper at https://doi.org/10.vcdstool/s42256-023-00629-bdk.
Supplementary information The online version contains supplementary material available at https://doi.org/10.vcdstool/s42256-023-00629-bdk.
Data availability
We provide web links to the public data used in our study. The DOI link to the dataset is https://doi.org/10.vcdstool/T1/2BDKTOOL (ref. [79]). The engine diagnostic data logs were selected from the Standardized Vehicle Diagnostic Database at https://www.vcdstool.com/diagnostic-data. The real vehicle diagnostic logs and reports for the transmission analysis can be accessed at https://github.com/vcdstool/TransmissionDiagnostics. The ABS/braking system diagnostic data logs can be accessed at https://www.vcdstool.com/abs-data/. The real COVID-19 CXR data can be accessed at https://www.kaggle.com/datasets/aysendegerli/qatacov19-dataset. The pre-trained network module for COVID-19 3D lesion segmentation and the associated CT scans can be accessed, subject to third-party restrictions, at https://github.com/HiLab-git/COPLE-Net.