We are looking for the best
At 42dot, we're building performance evaluation systems for in-service speech recognition and developing comprehensive training and evaluation datasets for our LLM modules through meticulous data annotation. We strategically collect TTS voice data to ensure a diverse range of authentic, high-quality audio samples. Additionally, we are at the forefront of defining our philosophy on voice design in automotive environments, integrating robust acoustic and user experience principles tailored specifically for vehicle settings. In this role, you will take part in dataset collection and validation to mitigate issues such as data bias and errors.
Responsibilities
Verification of Speech Data
Validate speech data related to STT, TTS, and wake-up word detection to ensure accuracy and consistency.
TTS Data Collection Strategy & Execution
Design and implement data collection strategies that reflect North American linguistic and cultural characteristics.
Secure high-quality English, Spanish, and French text and speech data from diverse sources (e.g., online media, audio archives, user interviews).
Data Quality Control
Review collected data for pronunciation, intonation, grammar, and vocabulary accuracy to ensure suitability for model training.
Perform outlier detection and data cleaning tasks (e.g., noise removal, audio clipping, text normalization).
Process Automation & Optimization
Develop scripts and tools (using Python, R, etc.) to automate repetitive tasks in data collection and verification.
Build and manage data pipelines and propose workflow improvements to optimize the process.
Outsourcing Management
Oversee and manage outsourcing agencies responsible for speech data labeling, ensuring adherence to quality standards and deadlines.
Collaboration & Communication
Work closely with development teams, speech engineers, and language experts to set data quality standards and project objectives.
Provide regular reports on project progress, challenges, and improvement measures.
Market & User Analysis
Analyze language usage trends, dialects, and intonation patterns in North America to continuously refine data collection strategies.
Incorporate user feedback and emerging research trends to update and improve the datasets.
Qualifications
Experience
More than 3 years of experience (or equivalent) in voice signal-related roles, including speech data verification, labeling, and managing outsourcing agencies.
Proven experience in collecting and validating speech data for various audio signal tasks.
Educational Background
Bachelor’s degree in Linguistics, Speech Signal Processing, Computer Science, Data Science, or a related field.
A Master’s degree or higher with relevant research experience is preferred.
Language & Communication Skills
Native-level proficiency in at least two of the following languages—English, French, and Spanish—is essential.
A strong understanding of North American dialects and cultural nuances is required.
Excellent documentation, presentation, and teamwork skills.
Professional-level Korean language proficiency is an asset for research collaboration.
Project Management & Problem-Solving
Strong analytical, problem-solving, and project management skills, with the ability to handle multiple tasks and set priorities effectively.
Preferred Qualifications
Specialized Industry Experience
Proven track record in quality control and management of audio data labeling projects.
Strong understanding of technologies related to STT, TTS, and wake-up word detection.
Project experience in this field and hands-on experience with deep learning frameworks such as TensorFlow and PyTorch.
Data Management Expertise
Experience in building and managing large-scale multi-modal (text + speech) datasets and optimizing data cleaning processes.
Sound Engineering & Narration Directing Expertise
Demonstrated experience in sound engineering, including designing and optimizing acoustic environments, implementing advanced audio processing techniques, and ensuring high-quality sound production for various applications.
Proven track record in narration directing, managing voice talent, and providing creative guidance to ensure that voice-over projects align with brand or project objectives.
Proficiency in using industry-standard audio production tools and software (e.g., Pro Tools, Adobe Audition) is highly desirable.
Professional & Academic Engagement
Active participation in industry conferences, seminars, or workshops, with contributions to patents, academic publications, or open-source projects.
Certifications
Relevant certifications in cloud services, data engineering, or machine learning (e.g., AWS Certified Solutions Architect, Google Cloud Professional Data Engineer) are a plus.
Interview Process
Application Review - Coding Test - 1st Interview - 2nd Interview - 3rd Interview - Offer Negotiation - Hiring
Screening procedures may differ by position and may change depending on the schedule and circumstances.
You will be notified individually of the screening schedule and results at the email address registered on the application form.
Additional Information
In accordance with fair hiring practices, do not include any personal information unrelated to your job qualifications (e.g., Social Security Number, family relations, marital status, age, photo, physical condition, place of birth, etc.) in your resume.
All documents must be submitted in PDF format and under 30MB in size.
If you experience issues uploading your resume, please send it along with the job posting URL to recruit@42dot.ai.
We strongly encourage applications from U.S. veterans and candidates eligible for employment preference under applicable laws.
Qualified individuals with disabilities are encouraged to apply and will receive consideration under the Americans with Disabilities Act (ADA).
42dot does not accept unsolicited resumes and will not pay fees for any such submissions.
Equal Opportunity Statement
42dot is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees, regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or veteran status.