

This dataset can be used for several tasks related to hyperspectral document image analysis and document forensic analysis, including handwritten optical character recognition, ink mismatch detection, writer identification at the sentence, word, and character level, handwriting-based gender classification, handwriting-based age prediction, handwritten word segmentation, and word generation. The handwritten notes written by each subject with different pens are annotated with rectangular boxes. Previous methods use synthetic mixed samples created by joining different parts of images from the UWA WIHSI dataset. In contrast, each document in this dataset consists of real mixed samples written with different pens and by different writers, with a variety of mixing ratios of inks and writers, for forensic analysis. Standard 70 gsm A4 pages manufactured by the "AA" company are used for data collection. Each subject has written twenty-eight sentences, each approximately 9 words or 33 characters long, together with all letters of the English alphabet in upper and lower case and the digits 0 to 9, using 12 different varieties of blue pens from different brands. The age and gender of each individual are recorded. The individuals have different personalities and their own writing patterns. Each hyperspectral cube in the dataset has a spatial resolution of 512 × 650 pixels and contains 149 spectral channels in the spectral range of 478-901 nm. The purpose of the presented dataset is to further explore the use of hyperspectral imaging in document image analysis and to benchmark the performance of forensic analysis methods on hyperspectral document images.

This article presents a dataset of hyperspectral images of handwriting samples collected from 54 individuals.
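
To make the stated cube dimensions concrete, the following is a minimal Python sketch of how one might index a cube with 149 channels spanning 478-901 nm. It rests on assumptions that are not stated in the dataset description: that the channels are uniformly spaced across the range (giving a step of roughly 2.86 nm), that the array is laid out as (rows, columns, bands), and that 512 is the row count. The helper names `band_wavelengths`, `nearest_band`, and `pixel_spectrum` are hypothetical, and a small zero-filled array stands in for real data.

```python
import numpy as np

# Figures taken from the dataset description.
N_ROWS, N_COLS, N_BANDS = 512, 650, 149  # assumption: 512 rows, 650 columns
LAMBDA_MIN_NM, LAMBDA_MAX_NM = 478.0, 901.0


def band_wavelengths() -> np.ndarray:
    """Centre wavelength of each channel in nm.

    Assumption: channels are uniformly spaced over 478-901 nm, endpoints included.
    """
    return np.linspace(LAMBDA_MIN_NM, LAMBDA_MAX_NM, N_BANDS)


def nearest_band(wavelength_nm: float) -> int:
    """Index of the channel whose assumed centre is closest to wavelength_nm."""
    return int(np.argmin(np.abs(band_wavelengths() - wavelength_nm)))


def pixel_spectrum(cube: np.ndarray, row: int, col: int) -> np.ndarray:
    """Spectral signature of one pixel, assuming (rows, cols, bands) layout."""
    return cube[row, col, :]


# Placeholder data: a small crop-sized array of zeros, not real dataset content.
# A full cube would have shape (N_ROWS, N_COLS, N_BANDS).
cube = np.zeros((4, 5, N_BANDS), dtype=np.float32)

step_nm = band_wavelengths()[1] - band_wavelengths()[0]
spectrum = pixel_spectrum(cube, 0, 0)
band_at_600 = nearest_band(600.0)
```

A per-pixel spectrum of this kind is the usual input to ink mismatch detection, where spectra of pixels from different ink strokes are compared; the actual channel spacing and file layout should be checked against the dataset's own documentation.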
