Senior Audio AI Engineer - TTS / Speech Synthesis (Remote Contractor) Job at Awarri, Remote

  • Awarri
  • Remote

Job Description

At Awarri, our mission is to enable the development and adoption of frontier technology across Africa, starting in Nigeria. We are building inclusive AI technologies—from LLMs to speech models—that reflect and empower African languages and cultural contexts.

Why Join Awarri?

  • Be part of a pioneering initiative shaping the future of AI in Africa.
  • Work on impactful projects that center real-world representation and inclusivity.
  • Collaborate with a passionate, globally distributed team of engineers, linguists, and researchers.

As a Senior Audio AI Engineer at Awarri, you will play a pivotal role in advancing the naturalness and quality of our Text-to-Speech (TTS) systems, focused on African languages and accents. We're seeking an engineer who understands the intricacies of prosody, rhythm, and speech alignment—and is excited to push the boundaries of audio AI in a meaningful cultural context.

This role is best suited for a specialist with deep experience in speech technologies and a passion for building expressive, production-ready TTS models. You'll be joining a collaborative, mission-driven team dedicated to shaping the future of generative audio systems in Africa.

Responsibilities

Model Development & Fine-Tuning

  • Optimize neural TTS models for prosody, pacing, and expressiveness (e.g., Tacotron 2, FastSpeech 2, Glow-TTS, VITS).
  • Improve duration prediction and phoneme-to-frame alignment using forced aligners or prosody-aware training.
  • Incorporate punctuation and linguistic markers into the model pipeline to improve natural flow.
  • Implement and fine-tune transformer-based architectures for speech synthesis and text-to-speech tasks.

Audio Engineering & Vocoder Optimization

  • Evaluate and fine-tune neural vocoders (e.g., HiFi-GAN, WaveGlow) to match desired voice characteristics and audio quality.
  • Identify and correct audio artifacts or inconsistencies in generated speech.
  • Optimize speech processing pipelines for efficiency and real-time performance.

Evaluation & Iteration

  • Lead both objective (e.g., duration errors, pitch contours) and subjective (e.g., MOS scoring) evaluations of TTS quality.
  • Collaborate with linguistic teams to benchmark pronunciation accuracy in Nigerian languages.
  • Develop automated testing frameworks to validate speech synthesis quality at scale.

Deployment & Production Readiness

  • Prepare the TTS system for product integration by improving inference speed and robustness.
  • Support the deployment of models across various platforms (cloud, mobile, embedded).
  • Optimize model inference using vLLM for efficient deployment.
  • Build APIs and backend services for TTS deployment using FastAPI and Flask.
  • Implement and manage data pipelines and storage solutions using MongoDB and MySQL.

Technical Skills & Requirements

  • Proficiency in Python and TypeScript for model development and backend integration.
  • Experience with transformer-based models for speech synthesis and NLP.
  • Strong background in machine learning frameworks such as TensorFlow or PyTorch.
  • Experience in designing scalable AI-driven applications.
  • Familiarity with FastAPI, Flask, and cloud-based deployment environments.
  • Knowledge of database management using MongoDB and MySQL.

Your Experience

Technical Expertise:

  • 3+ years of experience developing and deploying TTS or speech generation systems.
  • Deep knowledge of at least one neural TTS architecture and related vocoders.
  • Proficiency with PyTorch, TensorFlow, or JAX for building and training models.
  • Experience with audio processing tools (e.g., librosa, Praat, torchaudio).

Linguistic & Cultural Sensitivity:

  • Experience working with multilingual or low-resource speech data.
  • Familiarity with phonetics/phonology, especially as it relates to prosody and rhythm.

Engineering Workflow:

  • Experience building scalable training and evaluation pipelines.
  • Ability to debug complex model behavior and iterate quickly toward product quality.
  • Comfort working remotely and asynchronously with interdisciplinary teams.

Nice to Have

  • Prior work on African language speech systems or expressive TTS in non-English languages.
  • Interest in linguistic or cultural technology in the African context.
  • Contributions to open-source TTS or audio AI tools.
  • Experience with emotion modeling or speaker adaptation.

Job Tags

Remote job, Contract work, For contractors
