-
Sensor Data Augmentation from Skeleton Pose Sequences for Improving Human Activity Recognition
Authors:
Parham Zolfaghari,
Vitor Fortes Rey,
Lala Ray,
Hyun Kim,
Sungho Suh,
Paul Lukowicz
Abstract:
The proliferation of deep learning has significantly advanced various fields, yet Human Activity Recognition (HAR) has not fully capitalized on these developments, primarily due to the scarcity of labeled datasets. Despite the integration of advanced Inertial Measurement Units (IMUs) in ubiquitous wearable devices like smartwatches and fitness trackers, which offer self-labeled activity data from users, the volume of labeled data remains insufficient compared to domains where deep learning has achieved remarkable success. Addressing this gap, in this paper, we propose a novel approach to improve wearable sensor-based HAR by introducing a pose-to-sensor network model that generates sensor data directly from 3D skeleton pose sequences. Our method simultaneously trains the pose-to-sensor network and a human activity classifier, optimizing both data reconstruction and activity recognition. Our contributions include the integration of simultaneous training, direct pose-to-sensor generation, and a comprehensive evaluation on the MM-Fit dataset. Experimental results demonstrate the superiority of our framework with significant performance improvements over baseline methods.
Submitted 25 April, 2024;
originally announced June 2024.
-
Towards Building Autonomous Data Services on Azure
Authors:
Yiwen Zhu,
Yuanyuan Tian,
Joyce Cahoon,
Subru Krishnan,
Ankita Agarwal,
Rana Alotaibi,
Jesús Camacho-Rodríguez,
Bibin Chundatt,
Andrew Chung,
Niharika Dutta,
Andrew Fogarty,
Anja Gruenheid,
Brandon Haynes,
Matteo Interlandi,
Minu Iyer,
Nick Jurgens,
Sumeet Khushalani,
Brian Kroth,
Manoj Kumar,
Jyoti Leeka,
Sergiy Matusevych,
Minni Mittal,
Andreas Mueller,
Kartheek Muthyala,
Harsha Nagulapalli, et al. (13 additional authors not shown)
Abstract:
The modern cloud has turned data services into easily accessible commodities. With just a few clicks, users are now able to access a catalog of data processing systems for a wide range of tasks. However, the cloud brings in both complexity and opportunity. While cloud users can quickly start an application by using various data services, it can be difficult to configure and optimize these services to gain the most value from them. For cloud providers, managing every aspect of an ever-increasing set of data services, while meeting customer SLAs and minimizing operational cost, is becoming more challenging. Cloud technology enables the collection of significant amounts of workload traces and system telemetry. With the progress in data science (DS) and machine learning (ML), it is feasible and desirable to utilize a data-driven, ML-based approach to automate various aspects of data services, resulting in the creation of autonomous data services. This paper presents our perspectives and insights on creating autonomous data services on Azure. It also covers the future endeavors we plan to undertake and unresolved issues that still need attention.
Submitted 2 May, 2024;
originally announced May 2024.
-
BeSound: Bluetooth-Based Position Estimation Enhancing with Cross-Modality Distillation
Authors:
Hymalai Bello,
Sungho Suh,
Bo Zhou,
Paul Lukowicz
Abstract:
Smart factories leverage advanced technologies to optimize manufacturing processes and enhance efficiency. Implementing worker tracking systems, primarily through camera-based methods, ensures accurate monitoring. However, concerns about worker privacy and technology protection make it necessary to explore alternative approaches. We propose a non-visual, scalable solution using Bluetooth Low Energy (BLE) and ultrasound coordinates. BLE position estimation offers a very low-power and cost-effective solution, as the technology is available on smartphones and is scalable due to the large number of smartphone users, facilitating worker localization and safety protocol transmission. Ultrasound signals provide faster response times and higher accuracy but require custom hardware, increasing costs. To combine the benefits of both modalities, we employ knowledge distillation (KD) from ultrasound signals to BLE RSSI data. Once the student model is trained, the model only takes as inputs the BLE-RSSI data for inference, retaining the advantages of ubiquity and low cost of BLE RSSI. We tested our approach using data from an experiment with twelve participants in a smart factory test bed environment. We obtained an increase of 11.79% in the F1-score compared to the baseline (target model without KD and trained with BLE-RSSI data only).
Submitted 24 April, 2024;
originally announced April 2024.
-
Text me the data: Generating Ground Pressure Sequence from Textual Descriptions for HAR
Authors:
Lala Shakti Swarup Ray,
Bo Zhou,
Sungho Suh,
Lars Krupp,
Vitor Fortes Rey,
Paul Lukowicz
Abstract:
In human activity recognition (HAR), the availability of substantial ground truth is necessary for training efficient models. However, acquiring ground pressure data through physical sensors can be cost-prohibitive and time-consuming. To address this critical need, we introduce Text-to-Pressure (T2P), a framework designed to generate extensive ground pressure sequences from textual descriptions of human activities using deep learning techniques. We show that the combination of vector quantization of sensor data along with a simple text-conditioned autoregressive strategy allows us to obtain high-quality generated pressure sequences from textual descriptions with the help of discrete latent correlation between text and pressure maps. We achieved comparable performance on the consistency between text and generated motion with an R squared value of 0.722, Masked R squared value of 0.892, and FID score of 1.83. Additionally, we trained a HAR model with the synthesized data and evaluated it on pressure dynamics collected by a real pressure sensor, where it performs on par with a model trained on only real data. Combining both real and synthesized training data increases the overall macro F1 score by 5.9 percent.
Submitted 22 February, 2024;
originally announced February 2024.
-
ContextMix: A context-aware data augmentation method for industrial visual inspection systems
Authors:
Hyungmin Kim,
Donghun Kim,
Pyunghwan Ahn,
Sungho Suh,
Hansang Cho,
Junmo Kim
Abstract:
While deep neural networks have achieved remarkable performance, data augmentation has emerged as a crucial strategy to mitigate overfitting and enhance network performance. These techniques hold particular significance in industrial manufacturing contexts. Recently, image mixing-based methods have been introduced, exhibiting improved performance on public benchmark datasets. However, their application to industrial tasks remains challenging. The manufacturing environment generates massive amounts of unlabeled data on a daily basis, with only a few instances of abnormal data occurrences. This leads to severe data imbalance. Thus, creating well-balanced datasets is not straightforward due to the high costs associated with labeling. Nonetheless, this is a crucial step for enhancing productivity. For this reason, we introduce ContextMix, a method tailored for industrial applications and benchmark datasets. ContextMix generates novel data by resizing entire images and integrating them into other images within the batch. This approach enables our method to learn discriminative features based on varying sizes from resized images and train informative secondary features for object recognition using occluded images. With the minimal additional computation cost of image resizing, ContextMix enhances performance compared to existing augmentation techniques. We evaluate its effectiveness across classification, detection, and segmentation tasks using various network architectures on public benchmark datasets. Our proposed method demonstrates improved results across a range of robustness tasks. Its efficacy in real industrial environments is particularly noteworthy, as demonstrated using the passive component dataset.
Submitted 18 January, 2024;
originally announced January 2024.
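The mixing step described in the ContextMix abstract above (resize a whole image and paste it into another image from the same batch) can be illustrated with the following minimal NumPy sketch. It is not the authors' implementation. The function names `resize_nearest` and `context_mix_sketch`, the nearest-neighbour resizing, the single shared paste location per batch, and the area-proportional label mixing are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) image (hypothetical helper)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return img[rows[:, None], cols]


def context_mix_sketch(images, labels, scale, rng):
    """Paste a downscaled copy of a partner image into each image of the batch.

    images: (N, H, W, C) array, labels: (N, K) one-hot float array.
    Assumption: labels are mixed in proportion to the pasted area,
    which the abstract does not state.
    """
    n, h, w = images.shape[:3]
    perm = rng.permutation(n)  # partner image for every batch element
    ph, pw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.integers(0, h - ph + 1)   # one paste location for the whole batch
    left = rng.integers(0, w - pw + 1)

    mixed = images.copy()
    for i in range(n):
        patch = resize_nearest(images[perm[i]], ph, pw)
        mixed[i, top:top + ph, left:left + pw] = patch

    lam = (ph * pw) / (h * w)  # fraction of the image covered by the paste
    mixed_labels = (1.0 - lam) * labels + lam * labels[perm]
    return mixed, mixed_labels, lam


# Toy usage on random data (illustrative only).
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))
labels = np.eye(4)
mixed, mixed_labels, lam = context_mix_sketch(images, labels, scale=0.5, rng=rng)
print(mixed.shape, lam)
```

Because the partner image is pasted whole rather than cropped, the full object remains visible at a smaller size while part of the host image is occluded, which matches the two effects the abstract attributes to the method.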
-
CoSS: Co-optimizing Sensor and Sampling Rate for Data-Efficient AI in Human Activity Recognition
Authors:
Mengxi Liu,
Zimin Zhao,
Daniel Geißler,
Bo Zhou,
Sungho Suh,
Paul Lukowicz
Abstract:
Recent advancements in Artificial Neural Networks have significantly improved human activity recognition using multiple time-series sensors. While employing numerous sensors with high-frequency sampling rates usually improves the results, it often leads to data inefficiency and unnecessary expansion of the ANN, posing a challenge for their practical deployment on edge devices. Addressing these issues, our work introduces a pragmatic framework for data-efficient utilization in HAR tasks, considering the optimization of both sensor modalities and sampling rate simultaneously. Central to our approach are the designed trainable parameters, termed 'Weight Scores,' which assess the significance of each sensor modality and sampling rate during the training phase. These scores guide the sensor modalities and sampling rate selection. The pruning method allows users to make a trade-off between computational budgets and performance by selecting the sensor modalities and sampling rates according to the weight score ranking. We tested our framework's effectiveness in optimizing sensor modality and sampling rate selection using three public HAR benchmark datasets. The results show that the sensor and sampling rate combination selected via CoSS achieves similar classification performance to configurations using the highest sampling rate with all sensors but at a reduced hardware cost.
Submitted 3 January, 2024;
originally announced January 2024.
-
The Power of Training: How Different Neural Network Setups Influence the Energy Demand
Authors:
Daniel Geißler,
Bo Zhou,
Mengxi Liu,
Sungho Suh,
Paul Lukowicz
Abstract:
This work offers a heuristic evaluation of the effects of variations in machine learning training regimes and learning paradigms on the energy consumption of computing, especially HPC hardware with a life-cycle aware perspective. While increasing data availability and innovation in high-performance hardware fuel the training of sophisticated models, they also foster the fading perception of energy consumption and carbon emission. Therefore, the goal of this work is to raise awareness about the energy impact of general training parameters and processes, from learning rate over batch size to knowledge transfer. Multiple setups with different hyperparameter configurations are evaluated on three different hardware systems. Among many results, we found that even with the same model and hardware to reach the same accuracy, improperly set training hyperparameters consume up to 5 times the energy of the optimal setup. We also extensively examined the energy-saving benefits of learning paradigms including recycling knowledge through pretraining and sharing knowledge through multitask training.
Submitted 8 May, 2024; v1 submitted 3 January, 2024;
originally announced January 2024.
-
Accelerating Flow Simulations using Online Dynamic Mode Decomposition
Authors:
Seung Won Suh,
Seung Whan Chung,
Peer-Timo Bremer,
Youngsoo Choi
Abstract:
We develop an on-the-fly reduced-order model (ROM) integrated with a flow simulation, gradually replacing a corresponding full-order model (FOM) of a physics solver. Unlike offline methods requiring a separate FOM-only simulation prior to model reduction, our approach constructs a ROM dynamically during the simulation, replacing the FOM when deemed credible. Dynamic mode decomposition (DMD) is employed for online ROM construction, with a single snapshot vector used for rank-1 updates in each iteration. Demonstrated on a flow over a cylinder with Re = 100, our hybrid FOM/ROM simulation is verified in terms of the Strouhal number, resulting in a 4.4 times speedup compared to the FOM solver.
Submitted 30 November, 2023;
originally announced November 2023.
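As a rough illustration of the rank-1 update idea mentioned in the abstract above, the following minimal Python sketch refines a linear operator estimate A (with x_{k+1} ≈ A x_k) one snapshot pair at a time, using the standard recursive least-squares (Sherman-Morrison) form of online DMD. It is not the authors' implementation: the class name `OnlineDMD`, the initialization parameter `alpha`, and the toy rotation system are assumptions made for illustration, and no flow solver or FOM/ROM switching logic is included.

```python
import numpy as np


class OnlineDMD:
    """Rank-1 (recursive least-squares) estimate of A in x_{k+1} ≈ A x_k.

    Hypothetical helper for illustration only.
    """

    def __init__(self, n, alpha=1e4):
        self.A = np.zeros((n, n))   # current operator estimate
        self.P = alpha * np.eye(n)  # running inverse of the snapshot Gram matrix

    def update(self, x, y):
        """Incorporate one snapshot pair (x, y) with y ≈ A x."""
        Px = self.P @ x
        gamma = 1.0 / (1.0 + x @ Px)
        # Rank-1 correction of A using the prediction residual y - A x.
        self.A += gamma * np.outer(y - self.A @ x, Px)
        # Sherman-Morrison rank-1 update of P (P stays symmetric).
        self.P -= gamma * np.outer(Px, Px)


# Toy system (an assumption, not from the paper): a planar rotation.
theta = 0.3
A_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])

model = OnlineDMD(n=2)
x = np.array([1.0, 0.0])
for _ in range(40):
    y = A_true @ x      # stands in for one full-order solver step
    model.update(x, y)  # single-snapshot rank-1 update
    x = y

# DMD eigenvalues of the learned operator.
print(np.linalg.eigvals(model.A))
```

Each update costs O(n²) and needs only the newest snapshot pair, which is what makes this kind of construction usable while a simulation is still running.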
-
Remaining useful life prediction of Lithium-ion batteries using spatio-temporal multimodal attention networks
Authors:
Sungho Suh,
Dhruv Aditya Mittal,
Hymalai Bello,
Bo Zhou,
Mayank Shekhar Jha,
Paul Lukowicz
Abstract:
Lithium-ion batteries are widely used in various applications, including electric vehicles and renewable energy storage. The prediction of the remaining useful life (RUL) of batteries is crucial for ensuring reliable and efficient operation, as well as reducing maintenance costs. However, determining the life cycle of batteries in real-world scenarios is challenging, and existing methods have limitations in predicting the number of cycles iteratively. In addition, existing works often oversimplify the datasets, neglecting important features of the batteries such as temperature, internal resistance, and material type. To address these limitations, this paper proposes a two-stage RUL prediction scheme for Lithium-ion batteries using a spatio-temporal multimodal attention network (ST-MAN). The proposed ST-MAN is designed to capture the complex spatio-temporal dependencies in the battery data, including the features that are often neglected in existing works. Despite operating without prior knowledge of end-of-life (EOL) events, our method consistently achieves lower error rates, with mean absolute error (MAE) and mean square error (MSE) of 0.0275 and 0.0014, respectively, compared to existing convolutional neural networks (CNN) and long short-term memory (LSTM)-based methods. The proposed method has the potential to improve the reliability and efficiency of battery operations and is applicable in various industries.
Submitted 6 June, 2024; v1 submitted 29 October, 2023;
originally announced October 2023.
-
Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation
Authors:
Sangho Suh,
Meng Chen,
Bryan Min,
Toby Jia-Jun Li,
Haijun Xia
Abstract:
Thanks to their generative capabilities, large language models (LLMs) have become an invaluable tool for creative processes. These models have the capacity to produce hundreds and thousands of visual and textual outputs, offering abundant inspiration for creative endeavors. But are we harnessing their full potential? We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas, rather than empowering them to explore the vast latent design space in generative models. To address this limitation, we propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses. We demonstrate the feasibility and usefulness of this framework through the design and development of an interactive system, Luminate, and a user study with 14 professional writers. Our work advances how we interact with LLMs for creative tasks, introducing a way to harness the creative potential of LLMs.
Submitted 13 March, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
-
CoLadder: Supporting Programmers with Hierarchical Code Generation in Multi-Level Abstraction
Authors:
Ryan Yen,
Jiawen Zhu,
Sangho Suh,
Haijun Xia,
Jian Zhao
Abstract:
Programmers increasingly rely on Large Language Models (LLMs) for code generation. However, misalignment between programmers' goals and generated code complicates the code evaluation process and demands frequent switching between prompt authoring and code evaluation. Yet, current LLM-driven code assistants lack sufficient scaffolding to help programmers format intentions from their overarching goals, a crucial step before translating these intentions into natural language prompts. To address this gap, we adopted an iterative design process to gain insights into programmers' strategies when using LLMs for programming. Building on our findings, we created CoLadder, a system that supports programmers by facilitating hierarchical task decomposition, direct code segment manipulation, and result evaluation during prompt authoring. A user study with 12 experienced programmers showed that CoLadder is effective in helping programmers externalize their problem-solving intentions flexibly, improving their ability to evaluate and modify code across various abstraction levels, from goal to final code implementation.
Submitted 26 December, 2023; v1 submitted 12 October, 2023;
originally announced October 2023.
-
A Novel Local-Global Feature Fusion Framework for Body-weight Exercise Recognition with Pressure Mapping Sensors
Authors:
Davinder Pal Singh,
Lala Shakti Swarup Ray,
Bo Zhou,
Sungho Suh,
Paul Lukowicz
Abstract:
We present a novel local-global feature fusion framework for body-weight exercise recognition with floor-based dynamic pressure maps. Going one step beyond existing studies that use deep neural networks and focus mainly on global feature extraction, the proposed framework combines local and global features, using image processing techniques and YOLO object detection to localize pressure profiles from different body parts and consider physical constraints. The proposed local feature extraction method generates two sets of high-level local features consisting of cropped pressure mapping and numerical features such as angular orientation, location on the mat, and pressure area. In addition, we adopt knowledge distillation for regularization to preserve the knowledge of the global feature extraction and improve the performance of the exercise recognition. Our experimental results demonstrate a notable 11 percent improvement in F1 score for exercise recognition while preserving label-specific features.
Submitted 14 September, 2023;
originally announced September 2023.
-
An Empirical Study on Fault Detection and Root Cause Analysis of Indium Tin Oxide Electrodes by Processing S-parameter Patterns
Authors:
Tae Yeob Kang,
Haebom Lee,
Sungho Suh
Abstract:
In the field of optoelectronics, indium tin oxide (ITO) electrodes play a crucial role in various applications, such as displays, sensors, and solar cells. Effective fault diagnosis and root cause analysis of the ITO electrodes are essential to ensure the performance and reliability of the devices. However, traditional visual inspection is challenging with transparent ITO electrodes, and existing fault diagnosis methods have limitations in determining the root causes of the defects, often requiring destructive evaluations and secondary material characterization techniques. In this study, a fault diagnosis method with root cause analysis is proposed using scattering parameter (S-parameter) patterns, offering early detection, high diagnostic accuracy, and noise robustness. A comprehensive S-parameter pattern database is obtained according to various defect states of the ITO electrodes. Deep learning (DL) approaches, including multilayer perceptron (MLP), convolutional neural network (CNN), and transformer, are then used to simultaneously analyze the cause and severity of defects. Notably, it is demonstrated that the diagnostic performance under additive noise levels can be significantly enhanced by combining different channels of the S-parameters as input to the learning algorithms, as confirmed through the t-distributed stochastic neighbor embedding (t-SNE) dimension reduction visualization of the S-parameter patterns.
Submitted 10 June, 2024; v1 submitted 16 August, 2023;
originally announced August 2023.
-
Two-stage Early Prediction Framework of Remaining Useful Life for Lithium-ion Batteries
Authors:
Dhruv Mittal,
Hymalai Bello,
Bo Zhou,
Mayank Shekhar Jha,
Sungho Suh,
Paul Lukowicz
Abstract:
Early prediction of remaining useful life (RUL) is crucial for effective battery management across various industries, ranging from household appliances to large-scale applications. Accurate RUL prediction improves the reliability and maintainability of battery technology. However, existing methods have limitations, including assumptions of data from the same sensors or distribution, foreknowledge of the end of life (EOL), and neglect to determine the first prediction cycle (FPC) to identify the start of the unhealthy stage. This paper proposes a novel method for RUL prediction of Lithium-ion batteries. The proposed framework comprises two stages: determining the FPC using a neural network-based model to divide the degradation data into distinct health states and predicting the degradation pattern after the FPC to estimate the remaining useful life as a percentage. Experimental results demonstrate that the proposed method outperforms conventional approaches in terms of RUL prediction. Furthermore, the proposed method shows promise for real-world scenarios, providing improved accuracy and applicability for battery management.
Submitted 7 August, 2023;
originally announced August 2023.
-
Worker Activity Recognition in Manufacturing Line Using Near-body Electric Field
Authors:
Sungho Suh,
Vitor Fortes Rey,
Sizhen Bian,
Yu-Chi Huang,
Jože M. Rožanec,
Hooman Tavakoli Ghinani,
Bo Zhou,
Paul Lukowicz
Abstract:
Manufacturing industries strive to improve production efficiency and product quality by deploying advanced sensing and control systems. Wearable sensors are emerging as a promising solution for achieving this goal, as they can provide continuous and unobtrusive monitoring of workers' activities in the manufacturing line. This paper presents a novel wearable sensing prototype that combines IMU and body capacitance sensing modules to recognize worker activities in the manufacturing line. To handle these multimodal sensor data, we propose and compare early and late sensor data fusion approaches for multi-channel time-series convolutional neural networks and deep convolutional LSTM. We evaluate the proposed hardware and neural network model by collecting and annotating sensor data using the proposed sensing prototype and Apple Watches in the testbed of the manufacturing line. Experimental results demonstrate that our proposed methods achieve superior performance compared to the baseline methods, indicating the potential of the proposed approach for real-world applications in manufacturing industries. Furthermore, the proposed sensing prototype with a body capacitive sensor and feature fusion method achieves macro F1 scores 6.35% and 9.38% higher than the proposed sensing prototype without a body capacitive sensor and Apple Watch data, respectively.
Submitted 7 August, 2023;
originally announced August 2023.
-
PressureTransferNet: Human Attribute Guided Dynamic Ground Pressure Profile Transfer using 3D simulated Pressure Maps
Authors:
Lala Shakti Swarup Ray,
Vitor Fortes Rey,
Bo Zhou,
Sungho Suh,
Paul Lukowicz
Abstract:
We propose PressureTransferNet, a novel method for Human Activity Recognition (HAR) using ground pressure information. Our approach generates body-specific dynamic ground pressure profiles for specific activities by leveraging existing pressure data from different individuals. PressureTransferNet is an encoder-decoder model taking a source pressure map and a target human attribute vector as inputs, producing a new pressure map reflecting the target attribute. To train the model, we use a sensor simulation to create a diverse dataset with various human attributes and pressure profiles. Evaluation on a real-world dataset shows its effectiveness in accurately transferring human attributes to ground pressure profiles across different scenarios. We visually confirm the fidelity of the synthesized pressure shapes using a physics-based deep learning model and achieve a binary R-square value of 0.79 on areas with ground contact. Validation through classification with F1 score (0.911 ± 0.015) on physical pressure mat data demonstrates the correctness of the synthesized pressure maps, making our method valuable for data augmentation, denoising, sensor simulation, and anomaly detection. Applications span sports science, rehabilitation, and bio-mechanics, contributing to the development of HAR systems.
Submitted 1 August, 2023;
originally announced August 2023.
-
YOLOv8 for Defect Inspection of Hexagonal Directed Self-Assembly Patterns: A Data-Centric Approach
Authors:
Enrique Dehaerne,
Bappaditya Dey,
Hossein Esfandiar,
Lander Verstraete,
Hyo Seon Suh,
Sandip Halder,
Stefan De Gendt
Abstract:
Shrinking pattern dimensions leads to an increased variety of defect types in semiconductor devices. This has spurred innovation in patterning approaches such as Directed self-assembly (DSA) for which no traditional, automatic defect inspection software exists. Machine Learning-based SEM image analysis has become an increasingly popular research topic for defect inspection with supervised ML models often showing the best performance. However, little research has been done on obtaining a dataset with high-quality labels for these supervised models. In this work, we propose a method for obtaining coherent and complete labels for a dataset of hexagonal contact hole DSA patterns while requiring minimal quality control effort from a DSA expert. We show that YOLOv8, a state-of-the-art neural network, achieves defect detection precisions of more than 0.9 mAP on our final dataset which best reflects DSA expert defect labeling expectations. We discuss the strengths and limitations of our proposed labeling approach and suggest directions for future work in data-centric ML-based defect inspection.
Submitted 28 July, 2023;
originally announced July 2023.
-
Discovering interpretable elastoplasticity models via the neural polynomial method enabled symbolic regressions
Authors:
Bahador Bahmani,
Hyoung Suk Suh,
WaiChing Sun
Abstract:
Conventional neural network elastoplasticity models are often perceived as lacking interpretability. This paper introduces a two-step machine learning approach that returns mathematical models interpretable by human experts. In particular, we introduce a surrogate model where yield surfaces are expressed in terms of a set of single-variable feature mappings obtained from supervised learning. A post-processing step is then used to re-interpret the set of single-variable neural network mapping functions into mathematical form through symbolic regression. This divide-and-conquer approach provides several important advantages. First, it enables us to overcome the scaling issue of symbolic regression algorithms. From a practical perspective, it enhances the portability of learned models for partial differential equation solvers written in different programming languages. Finally, it enables us to have a concrete understanding of the attributes of the materials, such as convexity and symmetries of models, through automated derivations and reasoning. Numerical examples have been provided, along with an open-source code to enable third-party validation.
Submitted 1 February, 2024; v1 submitted 24 July, 2023;
originally announced July 2023.
-
Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods
Authors:
Lala Shakti Swarup Ray,
Bo Zhou,
Sungho Suh,
Paul Lukowicz
Abstract:
To help smart wearable researchers choose the optimal ground truth methods for motion capturing (MoCap) for all types of loose garments, we present a benchmark, DrapeMoCapBench (DMCB), specifically designed to evaluate the performance of optical marker-based and marker-less MoCap. High-cost marker-based MoCap systems are well known as precise gold standards. However, a less well-known caveat is that they require skin-tight fitting markers on bony areas to ensure the specified precision, making them questionable for loose garments. On the other hand, marker-less MoCap methods powered by computer vision models have matured over the years and cost little, as smartphone cameras suffice. To this end, DMCB uses large real-world recorded MoCap datasets to perform parallel 3D physics simulations with a wide range of diversities: six levels of drape from skin-tight to extremely draped garments, three levels of motion, and six body type and gender combinations. It benchmarks state-of-the-art optical marker-based and marker-less MoCap methods to identify the best-performing method in different scenarios. For casual loose garments, both marker-based and low-cost marker-less MoCap exhibit significant performance loss (> 10 cm), but for everyday activities involving basic and fast motions, marker-less MoCap slightly outperforms marker-based MoCap, making it a favorable and cost-effective choice for wearable studies.
Submitted 25 July, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Proxy Anchor-based Unsupervised Learning for Continuous Generalized Category Discovery
Authors:
Hyungmin Kim,
Sungho Suh,
Daehwan Kim,
Daun Jeong,
Hansang Cho,
Junmo Kim
Abstract:
Recent advances in deep learning have significantly improved the performance of various computer vision applications. However, discovering novel categories in an incremental learning scenario remains a challenging problem due to the lack of prior knowledge about the number and nature of new categories. Existing methods for novel category discovery are limited by their reliance on labeled datasets and prior knowledge about the number of novel categories and the proportion of novel samples in the batch. To address the limitations and more accurately reflect real-world scenarios, in this paper, we propose a novel unsupervised class incremental learning approach for discovering novel categories on unlabeled sets without prior knowledge. The proposed method fine-tunes the feature extractor and proxy anchors on labeled sets, then splits samples into old and novel categories and clusters on the unlabeled dataset. Furthermore, the proxy anchors-based exemplar generates representative category vectors to mitigate catastrophic forgetting. Experimental results demonstrate that our proposed approach outperforms the state-of-the-art methods on fine-grained datasets under real-world scenarios.
Submitted 2 November, 2023; v1 submitted 20 July, 2023;
originally announced July 2023.
-
SynthCal: A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms
Authors:
Lala Shakti Swarup Ray,
Bo Zhou,
Lars Krupp,
Sungho Suh,
Paul Lukowicz
Abstract:
Accurate camera calibration is crucial for various computer vision applications. However, measuring camera parameters in the real world is challenging and arduous, and there needs to be a dataset with ground truth to evaluate calibration algorithms' accuracy. In this paper, we present SynthCal, a synthetic camera calibration benchmarking pipeline that generates images of calibration patterns to measure and enable accurate quantification of calibration algorithm performance in camera parameter estimation. We present a SynthCal-generated calibration dataset with four common patterns, two camera types, and two environments with varying view, distortion, lighting, and noise levels. The dataset evaluates single-view calibration algorithms by measuring reprojection and root-mean-square errors for identical patterns and camera settings. Additionally, we analyze the significance of different patterns using Zhang's method, which estimates intrinsic and extrinsic camera parameters with known correspondences between 3D points and their 2D projections in different configurations and environments. The experimental results demonstrate the effectiveness of SynthCal in evaluating various calibration algorithms and patterns.
Submitted 3 July, 2023;
originally announced July 2023.
-
ClothFit: Cloth-Human-Attribute Guided Virtual Try-On Network Using 3D Simulated Dataset
Authors:
Yunmin Cho,
Lala Shakti Swarup Ray,
Kundan Sai Prabhu Thota,
Sungho Suh,
Paul Lukowicz
Abstract:
Online clothing shopping has become increasingly popular, but the high rate of returns due to size and fit issues has remained a major challenge. To address this problem, virtual try-on systems have been developed to provide customers with a more realistic and personalized way to try on clothing. In this paper, we propose a novel virtual try-on method called ClothFit, which can predict the draping shape of a garment on a target body based on the actual size of the garment and human attributes. Unlike existing try-on models, ClothFit considers the actual body proportions of the person and available cloth sizes for clothing virtualization, making it more appropriate for current online apparel outlets. The proposed method utilizes a U-Net-based network architecture that incorporates cloth and human attributes to guide the realistic virtual try-on synthesis. Specifically, we extract features from a cloth image using an auto-encoder and combine them with features from the user's height, weight, and cloth size. The features are concatenated with the features from the U-Net encoder, and the U-Net decoder synthesizes the final virtual try-on image. Our experimental results demonstrate that ClothFit can significantly improve the existing state-of-the-art methods in terms of photo-realistic virtual try-on results.
Submitted 24 June, 2023;
originally announced June 2023.
-
MeciFace: Mechanomyography and Inertial Fusion-based Glasses for Edge Real-Time Recognition of Facial and Eating Activities
Authors:
Hymalai Bello,
Sungho Suh,
Bo Zhou,
Paul Lukowicz
Abstract:
The increasing prevalence of stress-related eating behaviors and their impact on overall health highlights the importance of effective and ubiquitous monitoring systems. In this paper, we present MeciFace, an innovative wearable technology designed to monitor facial expressions and eating activities in real-time on-the-edge (RTE). MeciFace aims to provide a low-power, privacy-conscious, and highly accurate tool for promoting healthy eating behaviors and stress management. We employ lightweight convolutional neural networks as backbone models for facial expression and eating monitoring scenarios. The MeciFace system ensures efficient data processing with a tiny memory footprint, ranging from 11 KB to 19 KB. During RTE evaluation, the system achieves an F1-score of < 86% for facial expression recognition and 94% for eating/drinking monitoring, for the RTE of unseen users (user-independent case).
Submitted 3 April, 2024; v1 submitted 19 June, 2023;
originally announced June 2023.
-
Unsupervised Statistical Feature-Guided Diffusion Model for Sensor-based Human Activity Recognition
Authors:
Si Zuo,
Vitor Fortes Rey,
Sungho Suh,
Stephan Sigg,
Paul Lukowicz
Abstract:
Human activity recognition (HAR) from on-body sensors is a core functionality in many AI applications: from personal health, through sports and wellness to Industry 4.0. A key problem holding up progress in wearable sensor-based HAR, compared to other ML areas, such as computer vision, is the unavailability of diverse and labeled training data. Particularly, while there are innumerable annotated images available in online repositories, freely available sensor data is sparse and mostly unlabeled. We propose an unsupervised statistical feature-guided diffusion model specifically optimized for wearable sensor-based human activity recognition with devices such as inertial measurement unit (IMU) sensors. The method generates synthetic labeled time-series sensor data without relying on annotated training data. Thereby, it addresses the scarcity and annotation difficulties associated with real-world sensor data. By conditioning the diffusion model on statistical information such as mean, standard deviation, Z-score, and skewness, we generate diverse and representative synthetic sensor data. We conducted experiments on public human activity recognition datasets and compared the method to conventional oversampling and state-of-the-art generative adversarial network methods. Experimental results demonstrate that the method can improve the performance of human activity recognition and outperform existing techniques.
Submitted 19 May, 2024; v1 submitted 30 May, 2023;
originally announced June 2023.
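The statistical quantities named in the abstract above (mean, standard deviation, Z-score, skewness) can be illustrated for a single sensor window. This is a minimal sketch and not the authors' implementation: the function name `window_stats`, the use of population (not sample) statistics, and the zero-variance convention are all assumptions made for illustration.

```python
import math

def window_stats(window):
    """Summarize one 1-D sensor window (hypothetical helper, not from the paper).

    Returns the mean, the population standard deviation, the per-sample
    z-scores, and the skewness (mean of cubed z-scores).
    """
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    std = math.sqrt(var)
    if std == 0.0:
        # Assumed convention: a constant window has zero z-scores and zero skewness.
        return {"mean": mean, "std": 0.0, "z": [0.0] * n, "skew": 0.0}
    z = [(x - mean) / std for x in window]
    skew = sum(v ** 3 for v in z) / n
    return {"mean": mean, "std": std, "z": z, "skew": skew}

# A symmetric ramp: the mean is 2.0 and the skewness is (numerically) zero.
stats = window_stats([0.0, 1.0, 2.0, 3.0, 4.0])
```

A vector like this could serve as a conditioning input to a generative model; how the paper actually encodes and injects these features is not stated in the abstract.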
-
CaptAinGlove: Capacitive and Inertial Fusion-Based Glove for Real-Time on Edge Hand Gesture Recognition for Drone Control
Authors:
Hymalai Bello,
Sungho Suh,
Daniel Geißler,
Lala Ray,
Bo Zhou,
Paul Lukowicz
Abstract:
We present CaptAinGlove, a textile-based, low-power (1.15 W), privacy-conscious, real-time on-the-edge (RTE) glove-based solution with a tiny memory footprint (2 MB), designed to recognize hand gestures used for drone control. We employ lightweight convolutional neural networks as the backbone models and a hierarchical multimodal fusion to reduce power consumption and improve accuracy. The system yields an F1-score of 80% for the offline evaluation of nine classes: eight hand gesture commands and null activity. For the RTE, we obtained an F1-score of 67% (one user).
Submitted 7 June, 2023;
originally announced June 2023.
-
Chemical Property-Guided Neural Networks for Naphtha Composition Prediction
Authors:
Chonghyo Joo,
Jeongdong Kim,
Hyungtae Cho,
Jaewon Lee,
Sungho Suh,
Junghwan Kim
Abstract:
The naphtha cracking process heavily relies on the composition of naphtha, which is a complex blend of different hydrocarbons. Predicting the naphtha composition accurately is crucial for efficiently controlling the cracking process and achieving maximum performance. Traditional methods, such as gas chromatography and true boiling curve, are not feasible due to the need for pilot-plant-scale experiments or cost constraints. In this paper, we propose a neural network framework that utilizes chemical property information to improve the performance of naphtha composition prediction. Our proposed framework comprises two parts: a Watson K factor estimation network and a naphtha composition prediction network. Both networks share a feature extraction network based on Convolutional Neural Network (CNN) architecture, while the output layers use Multi-Layer Perceptron (MLP) based networks to generate two different outputs - Watson K factor and naphtha composition. The naphtha composition is expressed in percentages, and its sum should be 100%. To enhance the naphtha composition prediction, we utilize a distillation simulator to obtain the distillation curve from the naphtha composition, which is dependent on its chemical properties. By designing a loss function between the estimated and simulated Watson K factors, we improve the performance of both Watson K estimation and naphtha composition prediction. The experimental results show that our proposed framework can predict the naphtha composition accurately while reflecting real naphtha chemical properties.
Submitted 2 June, 2023;
originally announced June 2023.
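The abstract above notes that the predicted composition is expressed in percentages that must sum to 100%. One common way to guarantee such a constraint on a network output is a softmax scaled by 100; the sketch below shows that idea only. The abstract does not say how the authors enforce the constraint, so the softmax choice and the name `to_percentages` are assumptions.

```python
import math

def to_percentages(logits):
    """Map raw output scores to positive percentages that sum to 100.

    Hypothetical output-layer sketch: a numerically stable softmax scaled by 100.
    """
    m = max(logits)                          # subtract the max to avoid overflow
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [100.0 * e / total for e in exps]

# Four hypothetical component scores mapped to a composition in percent.
composition = to_percentages([2.0, 1.0, 0.5, -1.0])
```

Because the constraint holds by construction, a loss on the composition never has to penalize outputs that fail to sum to 100%.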
-
Cheat Sheet for Teaching Programming with Comics: Through the Lens of Concept-Language-Procedure Framework
Authors:
Sangho Suh
Abstract:
Comics is emerging as a popular medium for providing visual explanations of programming concepts and procedures. Recent research into this medium opened the door to new opportunities and tools to advance teaching and learning in computing. For instance, recent research on coding strip, a form of comic strip with its corresponding code, led to a new visual programming environment that generates comics from code and an experience report detailing various ways coding strips can be used to benefit students' learning. However, how comics can be designed and used to teach programming has not yet been documented in a concise, accessible format to ease their adoption. To fill this gap, we developed a cheat sheet that summarizes the pedagogical techniques and designs teachers can use in their teaching. To develop this cheat sheet, we analyzed prior work on coding strip, including 26 coding strips and 30 coding strip design patterns. We also formulated a concept-language-procedure framework to delineate how comics can facilitate teaching in programming. To evaluate our cheat sheet, we presented it to 11 high school CS teachers at an annual conference for computer studies educators and asked them to rate its readability, usefulness, organization, and their interest in using it for their teaching. Our analysis suggests that this cheat sheet is easy to read and understand, useful, and well-structured, and that it interests teachers in further exploring how they can incorporate comics into their teaching.
Submitted 1 June, 2023;
originally announced June 2023.
-
FieldHAR: A Fully Integrated End-to-end RTL Framework for Human Activity Recognition with Neural Networks from Heterogeneous Sensors
Authors:
Mengxi Liu,
Bo Zhou,
Zimin Zhao,
Hyeonseok Hong,
Hyun Kim,
Sungho Suh,
Vitor Fortes Rey,
Paul Lukowicz
Abstract:
In this work, we propose an open-source scalable end-to-end RTL framework FieldHAR, for complex human activity recognition (HAR) from heterogeneous sensors using artificial neural networks (ANN) optimized for FPGA or ASIC integration. FieldHAR aims to address the lack of apparatus to transform complex HAR methodologies often limited to offline evaluation to efficient run-time edge applications. The framework uses parallel sensor interfaces and integer-based multi-branch convolutional neural networks (CNNs) to support flexible modality extensions with synchronous sampling at the maximum rate of each sensor. To validate the framework, we used a sensor-rich kitchen scenario HAR application which was demonstrated in a previous offline study. Through resource-aware optimizations, with FieldHAR the entire RTL solution was created from data acquisition to ANN inference taking as low as 25% logic elements and 2% memory bits of a low-end Cyclone IV FPGA and less than 1% accuracy loss from the original FP32 precision offline study. The RTL implementation also shows advantages over MCU-based solutions, including superior data acquisition performance and virtually eliminating ANN inference bottleneck.
Submitted 22 May, 2023;
originally announced May 2023.
-
Sensecape: Enabling Multilevel Exploration and Sensemaking with Large Language Models
Authors:
Sangho Suh,
Bryan Min,
Srishti Palani,
Haijun Xia
Abstract:
People are increasingly turning to large language models (LLMs) for complex information tasks like academic research or planning a move to another city. However, while such tasks often require working in a nonlinear manner (e.g., arranging information spatially to organize and make sense of it), current interfaces for interacting with LLMs are generally linear to support conversational interaction. To address this limitation and explore how we can support LLM-powered exploration and sensemaking, we developed Sensecape, an interactive system designed to support complex information tasks with an LLM by enabling users to (1) manage the complexity of information through multilevel abstraction and (2) seamlessly switch between foraging and sensemaking. Our within-subject user study reveals that Sensecape empowers users to explore more topics and structure their knowledge hierarchically, thanks to the externalization of levels of abstraction. We contribute implications for LLM-based workflows and interfaces for information tasks.
Submitted 29 August, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Learning Graph Patterns of Reflection Coefficient for Non-destructive Diagnosis of Cu Interconnects
Authors:
Tae Yeob Kang,
Haebom Lee,
Sungho Suh
Abstract:
With the increasing operating frequencies and clock speeds in processors, interconnects affect both the reliability and performance of entire electronic systems. Fault detection and diagnosis of the interconnects are crucial for prognostics and health management (PHM) of electronics. However, traditional approaches using electrical signals as prognostic factors often face challenges in distinguishing defect root causes, necessitating additional destructive evaluations, and are prone to noise interference, leading to potential false alarms. To address these limitations, this paper introduces a novel approach for non-destructive detection and diagnosis of defects in Cu interconnects, offering early detection, enhanced diagnostic accuracy, and noise resilience. Our approach uniquely analyzes both the root cause and severity of interconnect defects by leveraging graph patterns of reflection coefficient, a technique distinct from traditional time series signal analysis. We experimentally demonstrate that the graph patterns possess the capability for fault diagnosis and serve as effective input data for learning algorithms. Additionally, we introduce a novel severity rating ensemble learning (SREL) approach, which significantly enhances diagnostic accuracy and noise robustness. Experimental results demonstrate that the proposed method outperforms conventional machine learning methods and multi-class convolutional neural networks (CNN), achieving a maximum accuracy of 99.3%, especially under elevated noise levels.
Submitted 9 July, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
Development of a thorium coating on an aluminium substrate by using electrodeposition method and alpha spectroscopy
Authors:
Dal-Ho Moon,
Vivek Chavan,
Vasant Bhoraskar,
Yeong Hoon Jeong,
Jung Ho Park,
Su-Jeong Suh,
Seung-Woo Hong
Abstract:
A thin coating of thorium on aluminium substrates with the areal density of 110 to 130 $\mu$g/cm$^2$ is developed over a circular area of 22 mm diameter by using the electrodeposition method. An electrodeposition system is fabricated to consist of three components: an anode made of a platinum mesh, a cylindrical-shape vessel to contain the thorium solution, and a cathode in the form of a circular aluminium plate. The aluminium plate is mounted horizontally, and the platinum mesh is connected to an axial rod of an electric motor, mounted vertically and normal to the plane of the aluminium. The electrolyte solution is prepared by dissolving a known weight of thorium nitrate powder in 0.8 M HNO3 and isopropanol. The system is operated either in constant voltage (CV) or constant current (CC) mode. Under the electric field between the anode and cathode, thorium ions were deposited on the aluminium substrate mounted on the cathode. In the CV mode at 320, 360, and 400 V and in the CC mode at 15 mA, thorium films were formed over a circular area of the aluminium substrate. The areal density of thorium coating was measured by detecting emitted alpha particles. The areal density of thorium varied from 80 to 130 $\mu$g/cm$^2$ by changing the deposition time from 10 to 60 min. The results from the CV mode and CC mode are compared, and the radial dependence in the measured areal density is discussed for different modes of the electric field. The developed thorium coatings are to be used in the in-house development of particle detectors, fast neutron converters, targets for thorium fission experiments, and other purposes.
Submitted 11 March, 2023;
originally announced March 2023.
-
A Knowledge Distillation framework for Multi-Organ Segmentation of Medaka Fish in Tomographic Image
Authors:
Jwalin Bhatt,
Yaroslav Zharov,
Sungho Suh,
Tilo Baumbach,
Vincent Heuveline,
Paul Lukowicz
Abstract:
Morphological atlases are an important tool in organismal studies, and modern high-throughput Computed Tomography (CT) facilities can produce hundreds of full-body high-resolution volumetric images of organisms. However, creating an atlas from these volumes requires accurate organ segmentation. In the last decade, machine learning approaches have achieved incredible results in image segmentation tasks, but they require large amounts of annotated data for training. In this paper, we propose a self-training framework for multi-organ segmentation in tomographic images of Medaka fish. We utilize the pseudo-labeled data from a pretrained Teacher model and adopt a Quality Classifier to refine the pseudo-labeled data. Then, we introduce a pixel-wise knowledge distillation method to prevent overfitting to the pseudo-labeled data and improve the segmentation performance. The experimental results demonstrate that our method improves mean Intersection over Union (IoU) by 5.9% on the full dataset and enables keeping the quality while using three times less markup.
Submitted 24 February, 2023;
originally announced February 2023.
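The abstract above reports its gain as mean Intersection over Union (IoU). For readers unfamiliar with the metric, the sketch below computes it over flattened label maps. It is an illustration only: the function name `mean_iou` and the convention of skipping classes absent from both prediction and target are assumptions, and the paper's exact evaluation protocol is not given in the abstract.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length sequences of integer labels.

    Hypothetical helper: classes that appear in neither sequence are skipped
    so they do not distort the average.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Four pixels, two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

For volumetric images the same computation applies after flattening the voxel grid into a single sequence of labels.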
-
InMyFace: Inertial and Mechanomyography-Based Sensor Fusion for Wearable Facial Activity Recognition
Authors:
Hymalai Bello,
Luis Alfredo Sanchez Marin,
Sungho Suh,
Bo Zhou,
Paul Lukowicz
Abstract:
Recognizing facial activity is a well-understood (but non-trivial) computer vision problem. However, reliable solutions require a camera with a good view of the face, which is often unavailable in wearable settings. Furthermore, in wearable applications, where systems accompany users throughout their daily activities, a permanently running camera can be problematic for privacy (and legal) reasons. This work presents an alternative solution based on the fusion of wearable inertial sensors, planar pressure sensors, and acoustic mechanomyography (muscle sounds). The sensors were placed unobtrusively in a sports cap to monitor facial muscle activities related to facial expressions. We present our integrated wearable sensor system, describe data fusion and analysis methods, and evaluate the system in an experiment with thirteen subjects from different cultural backgrounds (eight countries) and both sexes (six women and seven men). In a one-model-per-user scheme and using a late fusion approach, the system yielded an average F1 score of 85.00% for the case where all sensing modalities are combined. With a cross-user validation and a one-model-for-all-user scheme, an F1 score of 79.00% was obtained for thirteen participants (six females and seven males). Moreover, in a hybrid fusion (cross-user) approach and six classes, an average F1 score of 82.00% was obtained for eight users. The results are competitive with state-of-the-art non-camera-based solutions for a cross-user study. In addition, our unique set of participants demonstrates the inclusiveness and generalizability of the approach.
Submitted 8 February, 2023;
originally announced February 2023.
-
PresSim: An End-to-end Framework for Dynamic Ground Pressure Profile Generation from Monocular Videos Using Physics-based 3D Simulation
Authors:
Lala Shakti Swarup Ray,
Bo Zhou,
Sungho Suh,
Paul Lukowicz
Abstract:
Ground pressure exerted by the human body is a valuable source of information for human activity recognition (HAR) in unobtrusive pervasive sensing. While data collection from pressure sensors to develop HAR solutions requires significant resources and effort, we present a novel end-to-end framework, PresSim, to synthesize sensor data from videos of human activities to reduce such effort significantly. PresSim adopts a 3-stage process: first, extract the 3D activity information from videos with computer vision architectures; then simulate the floor mesh deformation profiles based on the 3D activity information and gravity-included physics simulation; lastly, generate the simulated pressure sensor data with deep learning models. We explored two approaches for the 3D activity information: inverse kinematics with mesh re-targeting, and volumetric pose and shape estimation. We validated PresSim with an experimental setup with a monocular camera to provide input and a pressure-sensing fitness mat (80x28 spatial resolution) to provide the sensor ground truth, where nine participants performed a set of predefined yoga sequences.
Submitted 1 February, 2023;
originally announced February 2023.
-
AI-KD: Adversarial learning and Implicit regularization for self-Knowledge Distillation
Authors:
Hyungmin Kim,
Sungho Suh,
Sunghyun Baek,
Daehwan Kim,
Daun Jeong,
Hansang Cho,
Junmo Kim
Abstract:
We present a novel adversarial penalized self-knowledge distillation method, named adversarial learning and implicit regularization for self-knowledge distillation (AI-KD), which regularizes the training procedure by adversarial learning and implicit distillations. Our model not only distills the deterministic and progressive knowledge obtained from the pre-trained and previous-epoch predictive probabilities but also transfers the knowledge of the deterministic predictive distributions using adversarial learning. The motivation is that self-knowledge distillation methods regularize the predictive probabilities with soft targets, but the exact distributions may be hard to predict. Our method deploys a discriminator to distinguish between the distributions of the pre-trained and student models, while the student model is trained to fool the discriminator during the training procedure. Thus, the student model can not only learn the pre-trained model's predictive probabilities but also align its distribution with that of the pre-trained model. We demonstrate the effectiveness of the proposed method with network architectures on multiple datasets and show the proposed method achieves better performance than state-of-the-art methods.
Submitted 21 March, 2024; v1 submitted 20 November, 2022;
originally announced November 2022.
-
Learning from the Best: Contrastive Representations Learning Across Sensor Locations for Wearable Activity Recognition
Authors:
Vitor Fortes Rey,
Sungho Suh,
Paul Lukowicz
Abstract:
We address the well-known wearable activity recognition problem of having to work with sensors that are non-optimal in terms of the information they provide but have to be used due to wearability/usability concerns (e.g., the need to work with wrist-worn IMUs because they are embedded in most smartwatches). To mitigate this problem, we propose a method that facilitates the use of information from sensors that are only present during the training process and are unavailable during the later use of the system. The method transfers information from the source sensors to the latent representation of the target sensor data through a contrastive loss that is combined with the classification loss during joint training. We evaluate the method on the well-known PAMAP2 and Opportunity benchmarks for different combinations of source and target sensors, showing average (over all activities) F1 score improvements of between 5% and 13%, with the improvement on individual activities that are particularly well suited to benefit from the additional information reaching between 20% and 40%.
Submitted 4 October, 2022;
originally announced October 2022.
-
Smart-Badge: A wearable badge with multi-modal sensors for kitchen activity recognition
Authors:
Mengxi Liu,
Sungho Suh,
Bo Zhou,
Agnes Gruenerbl,
Paul Lukowicz
Abstract:
Human health is closely associated with daily behavior and environment. However, keeping a healthy lifestyle is still challenging for most people, as it is difficult for them to recognize their living behaviors and identify their surrounding situations to take appropriate action. Human activity recognition is a promising approach to building a behavior model of users, by which users can get feedback about their habits and be encouraged to develop a healthier lifestyle. In this paper, we present a smart, lightweight wearable badge with six kinds of sensors, including an infrared array sensor MLX90640 offering privacy-preserving, low-cost, and non-invasive features, to recognize daily activities in a realistic unmodified kitchen environment. A multi-channel convolutional neural network (MC-CNN) based on data and feature fusion methods is applied to classify 14 human activities associated with potentially unhealthy habits. We also evaluate the impact of the infrared array sensor on the recognition accuracy of these activities. We demonstrate the performance of the proposed work in detecting the 14 activities performed by ten volunteers, with an average accuracy of 92.44% and an F1 score of 88.27%.
Submitted 12 January, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
The ReturnZero System for VoxCeleb Speaker Recognition Challenge 2022
Authors:
Sangwon Suh,
Sunjong Park
Abstract:
In this paper, we describe the top-scoring submissions of team RTZR for the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22) in the closed dataset, speaker verification Track 1. The top-performing system is a fusion of 7 models spanning 3 different types of model architectures. We focus on training models to learn extra-temporal information. Therefore, all models were trained with 4-6 second frames for each utterance. Also, for some of our fusion models, we apply the Large Margin Fine-tuning strategy, which has shown good performance in previous challenges. During the evaluation process, we apply scoring methods with adaptive symmetric normalization (AS-Norm) and matrix score average (MSA). Finally, we fuse all the trained models with logistic regression. The final submission achieves 0.165 DCF and 2.912% EER on the VoxSRC22 test set.
Submitted 21 September, 2022;
originally announced September 2022.
-
TASKED: Transformer-based Adversarial learning for human activity recognition using wearable sensors via Self-KnowledgE Distillation
Authors:
Sungho Suh,
Vitor Fortes Rey,
Paul Lukowicz
Abstract:
Wearable sensor-based human activity recognition (HAR) has emerged as a principal research area and is utilized in a variety of applications. Recently, deep learning-based methods have achieved significant improvement in the HAR field with the development of human-computer interaction applications. However, they are limited to operating in a local neighborhood in the process of a standard convolutional neural network, and correlations between different sensors on body positions are ignored. In addition, they still suffer from significant performance degradation due to large gaps in the distribution of training and test data, and behavioral differences between subjects. In this work, we propose a novel Transformer-based Adversarial learning framework for human activity recognition using wearable sensors via Self-KnowledgE Distillation (TASKED), which accounts for individual sensor orientations and spatial and temporal features. The proposed method is capable of learning cross-domain embedding feature representations from multi-subject datasets using adversarial learning and maximum mean discrepancy (MMD) regularization to align the data distribution over multiple domains. In the proposed method, we adopt teacher-free self-knowledge distillation to improve the stability of the training procedure and the performance of human activity recognition. Experimental results show that TASKED not only outperforms state-of-the-art methods on four real-world public HAR datasets (alone or combined) but also improves subject generalization effectively.
Submitted 8 December, 2022; v1 submitted 14 September, 2022;
originally announced September 2022.
-
CodeToon: Story Ideation, Auto Comic Generation, and Structure Mapping for Code-Driven Storytelling
Authors:
Sangho Suh,
Jian Zhao,
Edith Law
Abstract:
Recent work demonstrated how we can design and use coding strips, a form of comic strips with corresponding code, to enhance teaching and learning in programming. However, creating coding strips is a creative, time-consuming process. Creators have to generate stories from code (code->story) and design comics from stories (story->comic). We contribute CodeToon, a comic authoring tool that facilitates this code-driven storytelling process with two mechanisms: (1) story ideation from code using metaphor and (2) automatic comic generation from the story. We conducted a two-part user study that evaluates the tool and the comics generated by participants to test whether CodeToon facilitates the authoring process and helps generate quality comics. Our results show that CodeToon helps users create accurate, informative, and useful coding strips in a significantly shorter time. Overall, this work contributes methods and design guidelines for code-driven storytelling and opens up opportunities for using art to support computer science education.
Submitted 27 August, 2022;
originally announced August 2022.
-
Probabilistic deconstruction of a theory of gravity, Part II: curved space
Authors:
S. Josephine Suh
Abstract:
We propose that the underlying context of holographic duality and the Ryu-Takayanagi formula is that the volume measure of spacetime is a probability measure constrained by quantum dynamics. We define quantum stochastic processes using joint quantum distributions which are realized in a quantum system as expectation values of products of projectors. In anti-de Sitter JT gravity, we show that Einstein's equations arise from the evolution of probability under the quantum stochastic process induced by the boundary, with the area of compactified space in the gravitational theory identified as a probability density evolving under the quantum process. Extrapolating these and related results in flat JT gravity found in arXiv:2108.10916, we conjecture that general relativity arises in the semi-classical limit of the evolution of probability with respect to quantum stochastic processes.
Submitted 12 November, 2023; v1 submitted 25 August, 2022;
originally announced August 2022.
-
Estimation of 3D Body Shape and Clothing Measurements from Frontal- and Side-view Images
Authors:
Kundan Sai Prabhu Thota,
Sungho Suh,
Bo Zhou,
Paul Lukowicz
Abstract:
The estimation of 3D human body shape and clothing measurements is crucial for virtual try-on and size recommendation problems in the fashion industry but has always been challenging due to several conditions, such as the lack of publicly available realistic datasets, ambiguity in multiple camera resolutions, and the undefinable human shape space. Existing works proposed various solutions to these problems but did not succeed in industry adoption because of their complexity and restrictions. To address this complexity and these challenges, in this paper, we propose a simple yet effective architecture to estimate both shape and measurements from frontal- and side-view images. We utilize silhouette segmentation from the two multi-view images and implement an auto-encoder network to learn low-dimensional features from segmented silhouettes. Then, we adopt a kernel-based regularized regression module to estimate the body shape and measurements. The experimental results show that the proposed method provides competitive results on the synthetic dataset, NOMO-3d-400-scans Dataset, and RGB images of humans captured with different cameras.
Submitted 28 May, 2022;
originally announced May 2022.
-
Human-Centric Artificial Intelligence Architecture for Industry 5.0 Applications
Authors:
Jože M. Rožanec,
Inna Novalija,
Patrik Zajec,
Klemen Kenda,
Hooman Tavakoli,
Sungho Suh,
Entso Veliou,
Dimitrios Papamartzivanos,
Thanassis Giannetsos,
Sofia Anna Menesidou,
Ruben Alonso,
Nino Cauli,
Antonello Meloni,
Diego Reforgiato Recupero,
Dimosthenis Kyriazis,
Georgios Sofianidis,
Spyros Theodoropoulos,
Blaž Fortuna,
Dunja Mladenić,
John Soldatos
Abstract:
Human-centricity is the core value behind the evolution of manufacturing towards Industry 5.0. Nevertheless, there is a lack of architecture that considers safety, trustworthiness, and human-centricity at its core. Therefore, we propose an architecture that integrates Artificial Intelligence (Active Learning, Forecasting, Explainable Artificial Intelligence), simulated reality, decision-making, and users' feedback, focusing on synergies between humans and machines. Furthermore, we align the proposed architecture with the Big Data Value Association Reference Architecture Model. Finally, we validate it on three use cases from real-world case studies.
Submitted 19 October, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Multi-phase-field microporomechanics model for simulating ice lens growth and thaw in frozen soil
Authors:
Hyoung Suk Suh,
WaiChing Sun
Abstract:
This article presents a multi-phase-field poromechanics model that simulates the growth and thaw of ice lenses and the resultant frozen heave and thaw settlement in multi-constituent frozen soils. In this model, the growth of segregated ice inside the freezing-induced fracture is implicitly represented by the evolution of two phase fields that indicate the locations of segregated ice and the damaged zone, respectively. The evolution of the two phase fields is driven by driving forces that capture the physical mechanisms of ice and crack growth, respectively, while the phase field governing equations are coupled with the balance laws such that the coupling among heat transfer, solid deformation, fluid diffusion, crack growth, and phase transition can be observed numerically. Unlike phenomenological approaches that indirectly capture the freezing influence on the shear strength, the multi-phase-field model introduces an immersed approach where the homogeneous freezing and the ice lens growth are distinctly captured by the freezing characteristic function and the driving force, respectively. Verification and validation examples are provided to demonstrate the capacities of the proposed models.
Submitted 29 November, 2021;
originally announced November 2021.
-
Adversarial Deep Feature Extraction Network for User Independent Human Activity Recognition
Authors:
Sungho Suh,
Vitor Fortes Rey,
Paul Lukowicz
Abstract:
User dependence remains one of the most difficult general problems in Human Activity Recognition (HAR), in particular when using wearable sensors. This is due to the huge variability in the way different people execute even the simplest actions. In addition, detailed sensor fixtures and placement will be different for different people or even at different times for the same users. In theory, the problem can be solved by a large enough data set. However, recording data sets that capture the entire diversity of complex activity sets is seldom practicable. Instead, models are needed that focus on features that are invariant across users. To this end, we present an adversarial subject-independent feature extraction method with maximum mean discrepancy (MMD) regularization for human activity recognition. The proposed model is capable of learning a subject-independent embedding feature representation from multi-subject datasets and generalizing it to unseen target subjects. The proposed network is based on the adversarial encoder-decoder structure with MMD regularization to realign the data distribution over multiple subjects. Experimental results show that the proposed method not only outperforms state-of-the-art methods over the four real-world datasets but also improves subject generalization effectively. We evaluate the method on well-known public data sets, showing that it significantly improves user-independent performance and reduces variance in results.
Submitted 23 October, 2021;
originally announced October 2021.
-
Exploring Individual and Collaborative Storytelling in an Introductory Creative Coding Class
Authors:
Sangho Suh,
Ken Jen Lee,
Celine Latulipe,
Jian Zhao,
Edith Law
Abstract:
Teaching programming through storytelling is a popular pedagogical approach and an active area of research. However, most previous work in this area focused on K-12 students using block-based programming. Little, if any, work has examined the approach with university students using text-based programming. This experience report fills this gap. Specifically, we report our experience administering three storytelling assignments -- two individual and one collaborative -- in an introductory computer science class with 49 undergraduate students using $\textit{p5.js}$, a text-based programming library for creative coding. Our work contributes an understanding of students' experiences with the three authoring processes and a set of recommendations to improve the administration of and experience with individual and collaborative storytelling with text-based programming.
Submitted 29 September, 2021;
originally announced October 2021.
-
Using Comics to Introduce and Reinforce Programming Concepts in CS1
Authors:
Sangho Suh,
Celine Latulipe,
Ken Jen Lee,
Bernadette Cheng,
Edith Law
Abstract:
Recent work investigated the potential of comics to support the teaching and learning of programming concepts and suggested several ways $coding$ $strips$, a form of comic strip with its corresponding code, can be used. Building on this work, we tested the recommended use cases of $coding$ $strip$ in an undergraduate introductory computer science course at a large comprehensive university. At the end of the course, we surveyed students to assess their experience and found they benefited in various ways. Our work contributes a demonstration of the various ways comics can be used in introductory CS courses and an initial understanding of benefits and challenges with using comics in computing education gleaned from an analysis of students' survey responses and code submissions.
Submitted 27 September, 2021; v1 submitted 27 September, 2021;
originally announced September 2021.
-
Generalized multiscale feature extraction for remaining useful life prediction of bearings with generative adversarial networks
Authors:
Sungho Suh,
Paul Lukowicz,
Yong Oh Lee
Abstract:
Bearings are a key component in industrial machinery, and their failure may lead to unwanted downtime and economic loss. Hence, it is necessary to predict the remaining useful life (RUL) of bearings. Conventional data-driven approaches to RUL prediction require expert domain knowledge for manual feature extraction and may suffer from data distribution discrepancy between training and test data. In this study, we propose a novel generalized multiscale feature extraction method with generative adversarial networks. The adversarial training learns the distribution of training data from different bearings and is introduced for health stage division and RUL prediction. To capture the sequence feature from a one-dimensional vibration signal, we adapt a U-Net architecture that reconstructs features to process them with multiscale layers in the generator of the adversarial network. To validate the proposed method, comprehensive experiments on two rotating machinery datasets have been conducted to predict the RUL. The experimental results show that the proposed feature extraction method can effectively predict the RUL and outperforms the conventional RUL prediction approaches based on deep neural networks. The implementation code is available at https://github.com/opensuh/GMFE.
Submitted 26 September, 2021;
originally announced September 2021.
-
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Authors:
Boseop Kim,
HyoungSeok Kim,
Sang-Woo Lee,
Gichang Lee,
Donghyun Kwak,
Dong Hyeon Jeon,
Sunghyun Park,
Sungju Kim,
Seonhoon Kim,
Dongpil Seo,
Heungsub Lee,
Minyoung Jeong,
Sungjae Lee,
Minsub Kim,
Suk Hyun Ko,
Seokhun Kim,
Taeyong Park,
Jinuk Kim,
Soyoung Kang,
Na-Hyeon Ryu,
Kang Min Yoo,
Minsuk Chang,
Soobin Suh,
Sookyo In,
Jinseong Park
, et al. (12 additional authors not shown)
Abstract:
GPT-3 shows remarkable in-context learning ability of large-scale language models (LMs) trained on hundreds of billion scale data. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performances of different sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a Korean variant of 82B GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performances on various downstream tasks in Korean. Also, we show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. Then we discuss the possibility of materializing the No Code AI paradigm by providing AI prototyping capabilities to non-experts of ML by introducing HyperCLOVA studio, an interactive prompt engineering interface. Lastly, we demonstrate the potential of our methods with three successful in-house applications.
Submitted 28 November, 2021; v1 submitted 9 September, 2021;
originally announced September 2021.
-
Probabilistic deconstruction of a theory of gravity, Part I: flat space
Authors:
S. Josephine Suh
Abstract:
We define and analyze a stochastic process in anti-de Sitter Jackiw-Teitelboim gravity, induced by the quantum dynamics of the boundary and whose random variable takes values in $AdS_2$. With the boundary in a thermal state and for appropriate parameters, we take the asymptotic limit of the quantum process at short time scales and flat space, and show associated classical joint distributions have the Markov property. We find that Einstein's equations of the theory, sans the cosmological constant term, arise in the semi-classical limit of the quantum evolution of probability under the asymptotic process. In particular, in flat Jackiw-Teitelboim gravity, the area of compactified space solved for by Einstein's equations can be identified as a probability density evolving under the Markovian process.
Submitted 20 September, 2023; v1 submitted 24 August, 2021;
originally announced August 2021.