Best practice: ML training data for acoustic classifiers

"We want AI in acoustic inspection – how many training samples do we need?" We answer this question weekly. Honest answer: fewer than textbooks say – if you do it right.

Common misconceptions

"More data is always better." Wrong. Poorly structured data degrades the model.
"We need thousands of NOK parts." In reality you rarely have that many – and do not need them.
"AI tunes itself automatically." Only if data foundation is right.

Recommended minimum sample sizes

Model class	OK samples	NOK samples	Note
Classical threshold	30–100	5–20	tolerance build
One-class (anomaly)	200–500	0 (or few)	most common
Binary classifier	500–1,500	100–500	good balance
Multiclass defect model	1,000–3,000	50–200 per class	defect type diagnosis
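The table can be turned into a quick sanity check before training starts. The following is a minimal sketch, not a fixed tool: the dictionary keys, the function name `shortfalls` and the choice to test only against the lower end of each range are assumptions made for illustration.

```python
# Lower bounds taken from the table above. "per_class" marks the
# multiclass case, where the NOK minimum applies to each defect class.
MIN_SAMPLES = {
    "classical_threshold": {"ok": 30, "nok": 5, "per_class": False},
    "one_class": {"ok": 200, "nok": 0, "per_class": False},
    "binary": {"ok": 500, "nok": 100, "per_class": False},
    "multiclass": {"ok": 1000, "nok": 50, "per_class": True},
}


def shortfalls(model_class, n_ok, nok_counts):
    """Return how many samples are missing per category (empty dict = enough).

    nok_counts maps a defect name to its number of NOK samples.
    """
    rec = MIN_SAMPLES[model_class]
    gaps = {}
    if n_ok < rec["ok"]:
        gaps["OK"] = rec["ok"] - n_ok
    if rec["per_class"]:
        # Multiclass: every defect class needs its own minimum.
        for defect, n in nok_counts.items():
            if n < rec["nok"]:
                gaps[defect] = rec["nok"] - n
    else:
        # Other model classes: only the NOK total matters.
        total = sum(nok_counts.values())
        if total < rec["nok"]:
            gaps["NOK"] = rec["nok"] - total
    return gaps
```

Called as `shortfalls("multiclass", 900, {"crack": 60, "porosity": 20})`, the sketch reports the missing OK samples and the under-represented defect class separately, which helps decide where to collect next.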

Five principles

1. Multiple shifts and days: acoustic signals vary with temperature, hall noise, operator and tool wear. Include at least 3–5 different shifts in the training data.

2. Cover sensor variability: if two sensors run in parallel, train on both. If a sensor is replaceable, train on the spare type as well.

3. Provoke NOK parts: reduce the tempering temperature, leave tools dull, vary the rpm. Combined with a few real claim parts, this builds a robust defect set.

4. Manual edge case labelling: the most important 5 % are border cases. Have an acoustics specialist and a part specialist label them together.

5. Holdout for validation: set 20 % of the data aside before training and use it only for model evaluation.
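Principles 1 and 5 are easy to check in code. The sketch below counts distinct shifts and sets aside a 20 % holdout. The record fields (`date`, `shift`, `label`) and both function names are hypothetical, and splitting per label (stratification) is an assumption added here so that the few NOK parts end up in both sets; the principle itself only asks for 20 %.

```python
import random
from collections import defaultdict


def count_shifts(records):
    """Number of distinct (date, shift) combinations in the data."""
    return len({(r["date"], r["shift"]) for r in records})


def stratified_holdout(records, fraction=0.2, seed=0):
    """Split records into (train, holdout), keeping the label mix similar."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_label = defaultdict(list)
    for r in records:
        by_label[r["label"]].append(r)

    train, holdout = [], []
    for label in sorted(by_label):
        group = list(by_label[label])
        rng.shuffle(group)
        # A label with a single sample stays in training entirely.
        n_hold = max(1, round(len(group) * fraction)) if len(group) > 1 else 0
        holdout.extend(group[:n_hold])
        train.extend(group[n_hold:])
    return train, holdout
```

Fixing the seed and storing the holdout part IDs keeps the evaluation set stable across retraining runs. If the same part is measured several times, splitting by part ID rather than by measurement would be the safer variant.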

Data collection tips

Store every measurement with part ID, timestamp, sensor ID, operator and process parameters.
Archive raw signals – not just features. You will want them later.
Tag OK parts with variant metadata (colour, batch, supplier) for future root-cause analysis.
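One way to make these tips concrete is a fixed record per measurement. The following is only a sketch: the class name `Measurement`, every field name, and the idea of storing a path to the archived raw signal instead of the signal itself are assumptions, not a prescribed schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Measurement:
    part_id: str
    timestamp: str          # ISO 8601 string, e.g. "2024-01-15T06:30:00+00:00"
    sensor_id: str
    operator: str
    process_params: dict    # e.g. rpm, temperature
    raw_signal_path: str    # pointer to the archived raw signal file
    label: str = "OK"
    variant: dict = field(default_factory=dict)  # colour, batch, supplier


def to_json_line(m):
    """Serialise one measurement as a single JSON line for an append-only log."""
    return json.dumps(asdict(m), sort_keys=True)


def from_json_line(line):
    """Rebuild a Measurement from a line written by to_json_line."""
    return Measurement(**json.loads(line))
```

Keeping the metadata in a plain, line-oriented format next to the raw files means features can be recomputed later without touching the original recordings.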
