My Research

Using Machine Learning techniques to boost language acquisition

A Quick Summary:

Firstly, I'm trying to see whether it's possible to predict the words a toddler is likely to learn next by gathering information about their family, environment, current vocabulary and activities. In fact we already know this is possible[1], for roughly the following month, but by recording much more information and using different techniques, I'm hoping to improve the accuracy of this prediction. The next stage is for the system to recommend that the parent concentrates on certain words, in the hope that by learning those words the child will pick up language more quickly than they otherwise would. Ultimately, if it works, this may be useful as a technique for helping children who have communication delays (such as those with ASD).

I need as much data as I can get, so I'm looking for volunteers. If you have a child aged between 6 months and 3 years, I'd love you to help - click here! You'll get access to a phone app, into which you'll be asked to enter any new words you hear your child use, and to answer a monthly questionnaire.

In more detail:

Children’s level of language acquisition from around the age of two years upwards has been shown to be positively correlated with their later performance at school[2, 3]. It follows that one way to improve a child’s future school performance would be to encourage him or her to acquire language as early as possible. Children prefer to learn words that they can categorise with other words that they already know[4] – firstly through a similarity of shape (the 'shape bias'[5]) and then through other, more complex associations[6] as the child's mind creates more categories. It follows, then, that if a system used by the parent – such as a mobile application – were used to log information about the child's language development, it could also advise the parent which words to encourage the child to learn next, those words having been judged by the system to be the best for expanding the child's vocabulary. This may lead to the child learning more language at an earlier age. It could be of particular benefit to children who are already in groups likely to experience a delay in language acquisition (by virtue of demographic factors[7] or for medical reasons).
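To make the category-based preference concrete, here is a toy sketch (in Python, with entirely invented words, category tags and scoring rule – not the system's actual method) of how candidate words might be ranked by how many semantic categories they share with a child's existing vocabulary:

```python
# Illustrative only: hypothetical category tags for a handful of words.
CATEGORIES = {
    "dog":     {"animal", "pet", "four-legged"},
    "cat":     {"animal", "pet", "four-legged"},
    "cow":     {"animal", "farm", "four-legged"},
    "ball":    {"toy", "round"},
    "apple":   {"food", "round"},
    "tractor": {"vehicle", "farm"},
}

def rank_candidates(known_words, candidates):
    """Rank candidate words by the number of category tags they
    share with the categories already covered by known words."""
    known_tags = set()
    for w in known_words:
        known_tags |= CATEGORIES.get(w, set())
    scored = [(sum(t in known_tags for t in CATEGORIES.get(c, set())), c)
              for c in candidates]
    return [c for score, c in sorted(scored, reverse=True)]

# A child who knows "dog" and "ball" should prefer "cat" (shares all
# three of dog's categories) over "apple" (shares one) and "tractor".
print(rank_candidates({"dog", "ball"}, ["cat", "apple", "tractor"]))
```

In the real system the categories would of course have to be learned from data rather than hand-coded, and would differ between children.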

However, no two children are the same. A child living on a farm may have different environmental influences on their vocabulary compared to a child living in an inner-city area. Two words that may be closely semantically linked in one child's mind may not be linked at all in the mind of another.

Contemporary machine learning techniques[8], combined with cheaper and higher-performance computer hardware, have driven great progress in pattern recognition in recent years, and are being used to improve many applications of artificial intelligence. Recent advances include individual patient health prediction based on collective health record data[9, 10, 11]. In general terms, diagnosing possible health problems from a patient’s records and history – with access to a large volume of other patients' health data – is analogous to our problem, except that instead of predicting the likelihood of a particular health issue, we are predicting the likelihood of the child learning a particular group of words. In our proposed system, machine learning techniques should suggest the best words to learn next, based on the child's vocabulary and environmental profile, while continually learning and attempting to improve themselves. Of particular relevance here is Reinforcement Learning[12], and more specifically Reinforcement Learning with Long Short-Term Memory[13, 14]. It may be that by applying machine learning to our existing child vocabulary data, the individual’s environmental data, and the child's current language acquisition data, subtle individualised associative connections can be identified and used to inform the 'advisor' part of the system.
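As a minimal illustration of the reinforcement-learning idea (a simple epsilon-greedy bandit, not the proposed system itself), the sketch below treats each candidate word as an "arm" and the reward as whether the child learned the recommended word; the advisor gradually favours recommendations that pay off. The words and learning probabilities are made up, and a real system would condition its choices on the child's individual profile:

```python
import random

class WordAdvisor:
    """Epsilon-greedy advisor: mostly recommends the word with the
    best observed success rate, occasionally explores others."""

    def __init__(self, words, epsilon=0.1):
        self.epsilon = epsilon
        self.value = {w: 0.0 for w in words}   # running mean reward per word
        self.count = {w: 0 for w in words}

    def recommend(self):
        if random.random() < self.epsilon:              # explore
            return random.choice(list(self.value))
        return max(self.value, key=self.value.get)      # exploit

    def feedback(self, word, learned):
        """Update the estimate after observing whether the word was learned."""
        self.count[word] += 1
        # incremental update of the running mean
        self.value[word] += (learned - self.value[word]) / self.count[word]

# Toy simulation with invented per-word learning probabilities.
random.seed(0)
true_prob = {"cat": 0.8, "tractor": 0.2, "apple": 0.5}
advisor = WordAdvisor(list(true_prob))
for _ in range(500):
    w = advisor.recommend()
    advisor.feedback(w, int(random.random() < true_prob[w]))
print(max(advisor.value, key=advisor.value.get))  # typically "cat"
```

The Reinforcement Learning with LSTM approach cited above goes further, using the sequence of past observations to handle the fact that the child's state is only partially observable.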

If successful, the system may also have applications for non-typical subjects such as children with Autism Spectrum Disorder. Ultimately it may even have applications for individuals with brain injuries affecting speech and language.


Dr Floriana Grasso
Dr Terry Payne

Research Question:

Using machine learning methods, is it possible to create a computational model that will predict the words that a given typically-developing child – represented by static and time-varying environmental and vocabulary data – is most likely to learn next? If so, can we use the model to boost the rate of language acquisition? And if the model works for typically developing children, can it also be trained for atypically developing children, such as those with ASD?

Phase 1

Initial experimental work

Collection of public data, replication of existing experiments to verify method, testing of different techniques.

Phase 2

Data Collection

Data collection will be via a phone app. Volunteers will answer a questionnaire about their communication environment and record words learned by their child, along with a timestamp and a general idea of what the child has been up to. They will subsequently be asked to complete a questionnaire (the UK-CDI) via the app at regular intervals.
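As a rough illustration, a single word-log entry from the app might be represented by a record like the following sketch (the field names and format here are hypothetical, not the study's actual schema):

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime
import json

@dataclass
class WordEvent:
    """One logged word, as a parent might record it in the app.
    All field names are illustrative."""
    child_id: str
    word: str
    timestamp: str                       # ISO 8601 date/time
    activity: str = ""                   # what the child has been up to
    context_tags: list = field(default_factory=list)

event = WordEvent(
    child_id="c001",
    word="tractor",
    timestamp=datetime(2024, 5, 1, 9, 30).isoformat(),
    activity="visit to a farm",
    context_tags=["outdoors", "animals"],
)
# Serialise for upload/storage as JSON.
print(json.dumps(asdict(event)))
```

Collecting the timestamp and activity context alongside each word is what would allow the model to relate new words to the child's environment, not just to their existing vocabulary.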

Phase 3

Data Collection and feedback

Data collection will again be via a hybrid web app. Volunteers will again answer a questionnaire about their communication environment and record words learned by their child, along with a timestamp and a general idea of what the child has been up to. This time, however, the app will encourage the volunteer to concentrate on certain words. They will subsequently be asked to complete a questionnaire (the UK-CDI) via the app at regular intervals.

Phase 4

Analysis

The collected data will be analysed to determine whether the word recommendations are boosting language acquisition.


[1] N. Beckage, PhD thesis, University of Colorado.
[2] D. Bleses, G. Makransky, P. S. Dale, A. Hojen, and B. A. Ari, “Early productive vocabulary predicts academic achievement 10 years later,” Applied Psycholinguistics, pp. 1–16, 2016.
[3] D. Walker, C. Greenwood, and B. Hart, “Prediction of School Outcomes Based on Early Language Production and Socioeconomic Factors,” Child Development, vol. 65, no. 2, pp. 606–621, 1994.
[4] A. Borovsky and J. L. Elman, “Language input and semantic categories: A relation between cognition and early word learning,” Journal of Child Language, vol. 33, no. 4, pp. 759–790, 2006.
[5] L. Gershkoff-Stowe and L. B. Smith, “Shape and the first hundred nouns,” Child Development, vol. 75, no. 4, pp. 1098–1114, 2004.
[6] G. Diesendruck, “Mechanisms of Word Learning,” in Blackwell Handbook of Language Development, ch. 13, pp. 257–276, 2008.
[7] B. Hart and T. R. Risley, Meaningful differences in the everyday experience of young American children. Baltimore,MD: Paul H Brookes Publishing, 1995.
[8] Y. Bengio, “Learning Deep Architectures for AI,” Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.
[9] T. Pham, T. Tran, D. Phung, and S. Venkatesh, “DeepCare: A Deep Dynamic Memory Model for Predictive Medicine,” in Advances in Knowledge Discovery and Data Mining: PAKDD 2016 Proceedings (J. Bailey, L. Khan, T. Washio, G. Dobbie, Z. J. Huang, and R. Wang, eds.), pp. 30–41, Springer International Publishing, 2016.
[10] R. Miotto, L. Li, B. A. Kidd, and J. T. Dudley, “Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records,” Scientific Reports, vol. 6, 26094, 2016.
[11] Z. Liang, G. Zhang, J. X. Huang, and Q. V. Hu, “Deep learning for healthcare decision making with EMRs,” in Proceedings of the 2014 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2014), pp. 556–559, 2014.
[12] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing Atari with Deep Reinforcement Learning,” NIPS Deep Learning Workshop, 2013.
[13] B. Bakker, “Reinforcement Learning with Long Short-Term Memory,” Advances in Neural Information Processing Systems (NIPS), 2002.
[14] M. Hausknecht and P. Stone, “Deep Recurrent Q-Learning for Partially Observable MDPs,” arXiv preprint arXiv:1507.06527, 2015.

Want to help?

Leave us your details.