Journal article
Authors list: Morgenstern, Y; Schmidt, F; Fleming, RW
Publication year: 2019
Pages: 98-108
Journal: Vision Research
Volume number: 165
Open access status: Bronze
DOI Link: https://doi.org/10.1016/j.visres.2019.09.005
Publisher: Elsevier
Abstract:
One aspect of human vision unmatched by machines is the capacity to generalize from few samples. Observers tend to know when novel objects belong to the same class despite large differences in shape, material or viewpoint. A major challenge in studying such generalization is that participants can see each novel sample only once. To overcome this, we used crowdsourcing to obtain responses from 500 human observers on 20 novel object classes, with each stimulus compared to 1 or 16 related objects. The results reveal that humans generalize from sparse data in highly systematic ways that depend on the number and variance of the samples. We compared human responses to 'ShapeComp', an image-computable model based on > 100 shape descriptors, and 'AlexNet', a convolutional neural network that roughly matches humans at recognizing 1000 categories of real-world objects. With 16 samples, the models were consistent with human responses without free parameters. Thus, when there are a sufficient number of samples, observers rely on shallow but efficient processes based on a fixed set of features. With 1 sample, however, the models required different feature weights for each object. This suggests that one-shot categorization involves more sophisticated processes that actively identify the unique characteristics underlying each object class.
Citation Styles
Harvard Citation style: Morgenstern, Y., Schmidt, F. and Fleming, R.W. (2019) 'One-shot categorization of novel object classes in humans', Vision Research, 165, pp. 98–108. https://doi.org/10.1016/j.visres.2019.09.005
APA Citation style: Morgenstern, Y., Schmidt, F., & Fleming, R. W. (2019). One-shot categorization of novel object classes in humans. Vision Research, 165, 98–108. https://doi.org/10.1016/j.visres.2019.09.005