**TrainValTest7_Random.py:** This script groups all the words into 10 bins, where each bin accounts for 10% of the total word occurrences in the Train set. This lets you choose words from every bin for the test phase, so the test covers both the bins holding the least frequent words in the corpus and the bins holding the most frequent ones.
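The binning idea can be sketched as follows. This is a minimal illustration and not the script's actual code: the function names `bin_words_by_frequency` and `sample_test_words`, the rarest-first ordering, and the tie-breaking rule are all assumptions made for the example.

```python
import random
from collections import Counter


def bin_words_by_frequency(train_tokens, n_bins=10):
    """Split the vocabulary into n_bins, each covering roughly an equal
    share of the total occurrences (hypothetical helper)."""
    counts = Counter(train_tokens)
    total = sum(counts.values())
    # Rarest words first; ties broken alphabetically so the result is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
    bins = [[] for _ in range(n_bins)]
    cumulative = 0
    for word, count in ordered:
        # The bin is chosen from the occurrence mass accumulated before this word.
        idx = min(cumulative * n_bins // total, n_bins - 1)
        bins[idx].append(word)
        cumulative += count
    return bins


def sample_test_words(bins, per_bin=1, seed=0):
    """Draw up to per_bin words from every bin (hypothetical helper)."""
    rng = random.Random(seed)
    return [w for b in bins for w in rng.sample(b, min(per_bin, len(b)))]


tokens = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]
bins = bin_words_by_frequency(tokens, n_bins=2)
print(bins)  # [['c', 'd', 'b'], ['a']]
```

With real data a single very frequent word can exceed 10% of all occurrences on its own, so in this sketch some bins may hold one word or stay empty; `sample_test_words` simply skips empty bins.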