Well, this is really nice, but how is this different from BPE / char-level models? In my experience, bag-of-n-gram based models worked best, even though I struggled to apply them to NMT.
Data Scientist