Directory of audio format
It is possible to run validation on all audio files (and their respective .txt transcripts)
found recursively in two directories --val_audio_dir and --val_txt_dir.
Directory Structure
The audio and transcripts directories should contain the same number of files, and the file names should match. For example, the structure of the directories could be:
audio_dir/
dir1/
file1.wav
file2.wav
txt_dir/
dir1/
file1.txt
file2.txt
The audio and transcript files can be under the same directory.
Running Validation
Using data from directories for validation can be done by parsing the argument
--val_from_dir along with the audio and transcript directories as follows:
scripts/val.sh --val_from_dir --val_audio_dir audio_dir --val_txt_dir txt_dir --dataset_dir /path/to/dataset/dir
where the audio_dir and txt_dir are relative to the --dataset_dir.
When training on webdataset files (--read_from_tar=True in the train.py), validation on directories is not supported.