Solutions for data scarcity

  1. Using pretrained networks as feature extractors. While transfer learning can mitigate data shortages, pretrained networks are only useful when their inputs are similar to the data they were trained on.
  2. Co-training: a classifier already trained on an existing labeled dataset is used to classify the unlabeled data. (“Accumulated error is common in co-training, and the newly labeled data has a limited enhancement in classifier performance.” (Mohammadzadeh et al., 2024, p. 2))
  3. Active learning: the most informative unlabeled samples are identified and then labeled by annotators.
  4. Data augmentation (enriching the data): the augmented data must preserve the inherent features of an activity’s signal and form a valid output. This procedure is practical but limited, because the generated data differ only slightly from the real data. It therefore covers only a small neighborhood around the real data, which limits the gain in classifier accuracy.
  5. Learning-based augmentation methods such as generative adversarial networks (GANs).
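
Items 3 and 4 can be made concrete with a small sketch. This is an illustration only, not a method taken from the cited papers. It assumes sensor windows are NumPy arrays of shape (time, channels), that "most informative" means lowest top-class probability (least-confidence sampling, one common choice among several), and that augmentation means jitter and per-channel scaling. The function names and the sigma values are hypothetical.

```python
import numpy as np


def least_confident(probs: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k samples whose highest class probability is lowest.

    probs has shape (n_samples, n_classes). These are the samples that
    would be sent to annotators under least-confidence sampling.
    """
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:k]


def jitter(window: np.ndarray, sigma: float, rng: np.random.Generator) -> np.ndarray:
    """Add zero-mean Gaussian noise to every value of a (time, channels) window."""
    return window + rng.normal(0.0, sigma, size=window.shape)


def scale(window: np.ndarray, sigma: float, rng: np.random.Generator) -> np.ndarray:
    """Multiply each channel by one random factor drawn around 1.0."""
    factors = rng.normal(1.0, sigma, size=(1, window.shape[1]))
    return window * factors


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Hypothetical classifier outputs for three unlabeled windows.
    probs = np.array([[0.90, 0.10], [0.55, 0.45], [0.60, 0.40]])
    print("query these first:", least_confident(probs, k=2))

    # Hypothetical 3-axis accelerometer window with 128 time steps.
    window = rng.normal(size=(128, 3))
    augmented = scale(jitter(window, sigma=0.05, rng=rng), sigma=0.1, rng=rng)
    print("augmented shape:", augmented.shape)
```

With small sigma values the augmented windows stay close to the originals, which is the limitation noted in item 4: the new samples only cover a small neighborhood around the real data.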

“Despite considerable advancements in HAR and its numerous practical applications, there exist significant challenges that are inherent and specific to HAR using wearable sensing platforms [16, 20, 61, 63]. The most notable ones are: (i) paucity of labeled data – the time-consuming and expensive nature of wearable data collection along with its inherent privacy concerns have led to datasets being relatively smaller in size; (ii) difficulty in data annotation – ambiguity in the target activities and its context results in incorrectly labeled data; (iii) conflicting variance in data – induced by similar activities being performed differently and different activities resulting in similar sensor readings; and (iv) sensor noise – auto-calibration of sensors that depend on temperature and gravity corrections [78], and the underlying architecture of MEMS sensors [56] inducing noise in the data” (Leng et al., 2024, p. 4)

“the lack of labeled training data.” (Leng et al., 2024, p. 4)
including self-supervised learning [26–28, 67, 73], few-shot learning [21], semi-supervised learning [10], prototypical learning [9, 17], adversarial learning [8, 41], and transfer learning [69]. Recently, the idea of cross modality transfer