Colin Nagy | June 7, 2023
The Sorani Kurdish Edition
On language, culture, and preservation
Colin here. Some languages die over time. As indigenous languages fail to get passed down from generation to generation, continuity and the transmission of culture, customs, and history are damaged.
National Geographic has a project called Enduring Voices, which, in their words, is intended to “document endangered languages and prevent language extinction by identifying the most crucial areas where languages are endangered and embarking on expeditions to (i) Understand the geographic dimensions of language distribution (ii) Determine how linguistic diversity is linked to biodiversity (iii) Bring wide attention to the issue of language loss.”
In addition, linguists the world over have gone to lengths to record, preserve, and port spoken languages to physical media for posterity. It’s a hugely manual process that feels academic, archival, and not sufficiently applied or practical.
Why is this interesting?
Languages also live within our translation apps, which are used countless times every day to quickly bridge gaps in our understanding of each other. But one language was conspicuously missing, and a one-man effort managed to fix it.
Rest of World explains:
When Google learned to speak Sorani Kurdish, the company announced it without much fanfare. The Translate team shared the news in a brief blog post last May, revealing Sorani alongside 23 other languages like Twi from Ghana and Dogri from northern India. The post highlighted the new “zero-shot machine translation” system, as well as “the many native speakers, professors and linguists who worked with us.”
In the case of Sorani Kurdish, much of that work came down to one person: a 31-year-old from Halabja named Bokan Hassan, or Bokan Jaff. A soft-spoken man, he graduated from the University of Sulaimani in 2014 with a degree in English. But he struggled to find work after graduation, which led him to a series of piecemeal translation jobs. Neighbors would often come to his house for translation help, noticing that their own language was not included in most automated translation platforms.
Image: Rest of World.
Google Translate needs manual input to be effective, and it leaned on Hassan (and his real-life social network) to train its systems. According to the piece, “For some languages, particularly those with fewer living speakers, it can be a challenge to mobilize enough people to ensure that the service is translating long, complex sentences with consistent accuracy.”
Hassan mobilized his community to provide thousands of hours of training for Google Translate. Because he catalyzed the project, persisted, and tapped into an important community, the language will live online and be available for native speakers to bridge the gap to other tongues around the world. It’s a seismic impact driven by one advocate and some impassioned native speakers. (CJN)
Thanks for reading,
Noah (NRB) & Colin (CJN)
—
Why is this interesting? is a daily email from Noah Brier & Colin Nagy (and friends!) about interesting things.